
Senior Data Lakehouse Engineer

CLARA Analytics

Posted on Dec 11, 2024

Senior Data Engineer (Lakehouse)

About CLARA

CLARA Analytics is the leading AI-as-a-service (AIaaS) provider that improves casualty claims outcomes for commercial insurance carriers and self-insured organizations. The company's product suite for workers' comp, commercial auto, and general liability insurance claims applies image recognition, natural language processing, and other AI-based techniques to unlock insights from medical notes, bills, and other documents surrounding a claim. CLARA's customers range from the top 25 global insurance carriers to large third-party administrators and self-insured organizations. Founded in 2017, CLARA Analytics is headquartered in California's Silicon Valley. For more information, visit www.claraanalytics.com, and follow the company on LinkedIn and Twitter.

This is a chance to get in early with a rapidly growing insurtech company, and to participate in developing the next generation of truly game-changing products in the insurance industry. CLARA Analytics is a leader in providing the most technologically advanced solutions in the industry, dedicated to expanding the horizons of what is possible. Our mission is to drive the best claims outcomes for both insurers and the insured using innovative machine learning, including deep learning and natural language processing. Our models provide key insights and predictions that guide the claims adjuster in making optimal decisions at every step of the claim process.

CLARA is looking for a player-coach Data Engineer who can work with product, business analysts, and the broader data engineering team, as well as deliver production-worthy code.

Key Responsibilities

  • Design, implement, and optimize the Data Lakehouse architecture, ensuring high performance, scalability, and reliability for both batch and real-time data processing.
  • Optimize the performance of data pipelines and queries, leveraging the hybrid architecture of the lakehouse.
  • Manage the integration of various data sources into the Data Lakehouse, including structured, semi-structured, and unstructured data.
  • Collaborate with the broader data engineering, data science, and business teams to define data storage needs and ensure that data models align with business requirements.
  • Define and enforce data governance, quality standards, and security policies to ensure data integrity and compliance with industry standards.
  • Monitor workflow performance and reliability, and ensure SLA targets are met.
  • Automate existing code and processes using scripting, CI/CD, infrastructure-as-code and configuration management tools.
  • Troubleshoot data-related issues and improve overall system performance and security.
  • Work with big data tools such as Apache Spark, Delta Lake, and other cloud-based technologies to process large datasets (see the sketch after this list).
  • Support and maintain the Data Lakehouse environment, ensuring its stability and reliability.
  • Stay current with trends in big data, cloud technologies, and Data Lakehouse architecture best practices.
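To give a concrete flavor of this kind of work, below is a minimal, illustrative PySpark sketch that lands raw claim records in a Delta Lake table. The paths and column names (claim_id, claim_date) are hypothetical placeholders rather than CLARA's actual schema, and it assumes the delta-spark package is available on the cluster.

```python
# Illustrative sketch only: a minimal PySpark batch job writing to Delta Lake.
# All paths and column names (claim_id, claim_date) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("claims-lakehouse-batch")
    # Standard Delta Lake configs, assuming delta-spark is on the classpath.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Read raw claim records (hypothetical S3 location), stamp an ingestion time,
# and deduplicate on the claim identifier.
raw = spark.read.json("s3://example-bucket/claims/raw/")
curated = (
    raw.withColumn("ingested_at", F.current_timestamp())
       .dropDuplicates(["claim_id"])
)

# Append to a partitioned Delta table backing the curated lakehouse layer.
(
    curated.write.format("delta")
    .mode("append")
    .partitionBy("claim_date")
    .save("s3://example-bucket/claims/curated/")
)
```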

Skills, Knowledge, and Expertise

  • 7+ years of hands-on experience with Redshift (preferred), Snowflake, Google BigQuery, or similar cloud-based data processing and warehousing solutions.
  • Experience with AWS Lake Formation.
  • 3+ years of hands-on experience developing ELT/ETL solutions.
  • Proven experience (3-5 years) designing, implementing, and managing Data Lakehouse architectures, with knowledge of Delta Lake, Apache Hudi, Apache Iceberg, or other related technologies.
  • Fluency in at least one programming language is required (Python or Java preferred; SQL accepted).
  • High proficiency in SQL programming with relational databases. Experience writing complex SQL queries is a must.
  • Proficient in data modeling techniques and concepts for both structured and unstructured data.
  • Experience building data pipelines (Hive, Spark, and Spark SQL preferred) and with DAG orchestration and workflow management tools such as Airflow and AWS Step Functions (see the sketch after this list).
  • Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Experience leading small teams that deliver code and features supporting product and analytics.
  • Knowledge of machine learning workflows and integration with data lakes/lakehouses is a plus.
  • Experience with data visualization or BI tools (e.g., Superset, Chart.js, Amazon QuickSight, Highcharts) is a plus.
  • Strong knowledge of data governance, security, and privacy policies; AWS Lake Formation experience is a plus.
  • Strong communication skills, with the ability to collaborate across technical and business teams.
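For a sense of the orchestration side of the role, here is a minimal, illustrative Airflow DAG wiring an extract step to a lakehouse load. The DAG id, schedule, and task callables are placeholders, and it assumes Airflow 2.4+ for the schedule argument.

```python
# Illustrative sketch only: a minimal Airflow DAG for a daily ELT run.
# DAG id, schedule, and task bodies are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    """Pull raw claim documents from the source system (placeholder)."""


def load_to_lakehouse():
    """Land the extracted records in the lakehouse (placeholder)."""


with DAG(
    dag_id="claims_elt",               # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # Airflow 2.4+ spelling of schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load_to_lakehouse)

    # Run the load only after the extract succeeds.
    extract_task >> load_task
```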

Nice to Have

  • AWS certifications (AWS Certified Solutions Architect, Developer, or DevOps).
  • Knowledge of commercial claims management systems.

What We Offer

  • The opportunity to make a real impact on a growing company.
  • Work on challenging and rewarding projects that will push your technical skills.
  • Collaborative and supportive work environment.
  • Competitive salary and benefits package.
  • Be a part of a team that is passionate about what we do.

Ready to Join?

We are looking for a passionate and talented engineer to join our team. If you are excited about building innovative software and helping us achieve our mission, we encourage you to apply!