Sr Data Scientist (NLP/Task Mining)

CLARA analytics

Software Engineering, Data Science
Posted on Friday, May 3, 2024

Senior Data Scientist (NLP/Task Mining)

About Us

Clara Analytics' mission is to give insurance claims managers the power to improve outcomes dramatically, save millions of dollars, and help thousands of people recover from injuries, recoup damages, and get back on their feet faster after suffering a loss. The opportunities to develop high-value AI and machine learning applications in the insurance industry are nearly unbounded: the sector has only just begun to realize the benefits of adopting these technologies. We are looking for high performers with an entrepreneurial spirit who welcome the challenges and opportunities of thinking big in uncharted territory and are driven to solve challenging problems and develop cutting-edge applications that have a significant impact.

Job Description

We are seeking a highly skilled and experienced Senior Data Scientist with expertise in task mining and information extraction from unstructured data to join our team. The ideal candidate will have a strong background in machine learning, deep learning, and graph analytics, as well as NLP and task mining techniques. You will use your expertise to build a timeline of task-based events related to insurance claims, including entities and other information, and to identify patterns in events across industry-wide data.

Responsibilities

  • Design and implement data mining and natural language processing (NLP) algorithms to extract tasks, events, entities, relationships, and other relevant information from unstructured insurance claims and medical record text data.
  • Develop machine learning models for task detection, classification, clustering, and temporal analysis.
  • Conduct exploratory data analysis to understand the characteristics and patterns of insurance claims and medical record text data, identify key features and signals indicative of tasks, and inform the design of task mining algorithms.
  • Use graph analytics techniques to represent relationships between tasks and other relevant entities as graphs, and perform graph-based analysis for task detection, clustering, and visualization.
  • Implement entity linking algorithms to map textual mentions to entities in knowledge bases or ontologies and resolve entity ambiguities and synonyms to ensure accurate information extraction.
  • Stay abreast of the latest research and advancements in machine learning and incorporate relevant techniques and methodologies into our solutions.
  • Work closely with our product and engineering teams to ensure clear functional requirements that enable us to design and develop new applications and features.
  • Review your peers' code and models, providing actionable feedback to ensure a high standard of quality.

Minimum Qualifications

  • MS degree in a quantitative discipline (e.g., machine learning, computer science, statistics, mathematics, physics).
  • 5+ years of experience developing machine learning models.
  • 3+ years of practical experience in natural language processing, plus a strong academic background.
  • Hands-on experience with text preprocessing, named entity recognition, entity linking, task mining, graph analytics techniques, and graph databases (Neo4j).
  • Solid understanding of NLP fundamentals including word embeddings, sequence-to-sequence models, attention mechanisms, and transformer architectures.
  • Ability to assess the pros and cons of different ML methods and algorithms, break problems down into standard tasks, and prototype quickly.
  • Demonstrated ability to communicate complex quantitative concepts effectively to audiences of varying technical proficiency.

Preferred Qualifications

  • PhD in a quantitative discipline, as described above.
  • 2+ years of relevant work experience in the insurance or medical industries, including development and implementation of production-quality machine learning applications.
  • Experience coupling language models with other tools and technologies (knowledge graphs, domain-specific ontologies, etc.).