Data Scientist, NLP

This job is no longer open

Your Future Evolves Here

Evolent Health has a bold mission to change the health of the nation by changing the way health care is delivered. Evolenteers make a difference wherever they are, whether it is at a medical center, in the office, or while working from home across 48 states. We empower you to work from where you work best, which makes juggling careers, families, and social lives so much easier. Through our recognition programs, we also highlight employees who live our values, give back to our communities each year, and are champions for bringing their whole selves to work each day. If you’re looking for a place where your work can be personally and professionally rewarding, don’t just join a company with a mission. Join a mission with a company behind it.

Why We’re Worth the Application:

  • We continue to grow year over year.
  • Recognized as a leader in driving important diversity, equity, and inclusion (DE&I) efforts.
  • Achieved a 100% score two years in a row on the Human Rights Campaign's Corporate Equality Index recognizing us as a best place to work for LGBTQ+ equality.
  • Named to’s list of the best companies for women to advance for 3 years in a row (2020, 2021 and 2022).
  • Continue to prioritize the employee experience and achieved a 90% overall engagement score on our employee survey in May 2022.
  • Publish an annual DE&I report to share our progress on how we’re building an equitable workplace.

What You’ll Be Doing:

Position summary

The Data Scientist, NLP will support building AI products in an Agile fashion that empower healthcare payers, providers, and members to quickly process medical data, make informed decisions, and reduce overall health care costs. As a research scientist/engineer on the Data Science and Artificial Intelligence team, you will work primarily with unstructured text data to build machine learning models for information retrieval and processing applications. These applications include, but are not limited to, optical character recognition, understanding the contents of medical documents using natural language processing with the help of large language models, and integrating processes into the overall AI pipeline to mine healthcare and medical information with high recall and other relevant metrics. We ingest claims, medical charts, and other documents from providers containing unstructured data, which is transformed into structured data to support automated entry into our storage layers for downstream applications and automated assessment against medical policies. The results will be used in real-time operational processes with both automated and augmented human-based decision making (i.e., copilot tools) to reduce healthcare administrative costs. We work with the offerings of all major cloud and big data vendors, including but not limited to Azure, AWS, Google, and IBM, to achieve AI goals in healthcare and support Evolent's business.

Essential Functions & Qualifications

The Data Scientist, NLP will have the opportunity to work within a team and to shape team culture and operating norms as a result of the fast-paced nature of a high-growth organization.

  • 2+ years of industry experience primarily related to unstructured text data and NLP (PhD work and internships related to unstructured text will be considered in lieu of industry experience).
  • Unix/Linux background and experience with at least one cloud vendor such as AWS, Azure, or Google.
  • Develop natural language comprehension products for medical and healthcare documents that support Evolent Health business objectives and products, improve processing efficiency, and reduce overall healthcare costs.
  • Gather external data sets, build synthetic data, and label data sets as needed for NLP/NLR/NLU.
  • Apply expert software engineering skills to build natural language products that improve automation and user experience, leveraging unstructured data storage, entity recognition, POS tagging, ontologies, taxonomies, data mining, information retrieval techniques, machine learning approaches, and distributed and cloud computing platforms.
  • Enhance the natural language and text mining products, from platforms to systems for training, versioning, deploying, storing, and testing models, and create real-time feedback loops that lead to fully automated services.
  • Research, design, and implement innovative approaches utilizing state-of-the-art Transformer models, large language models (such as BERT and OpenAI models), and generative AI techniques to solve complex healthcare challenges.
  • Thorough understanding of deep learning architectures and hands-on experience with one or more frameworks such as TensorFlow, PyTorch, Keras, or Trax.
  • Hands-on experience with libraries and tools such as spaCy, NLTK, Stanford CoreNLP, Gensim, and John Snow Labs.
  • Stay up to date with advancements in AI, NLP, and generative modelling to ensure our solutions remain stable and state-of-the-art.
  • Rapidly prototype ideas, approaches, and methods in the AI space.
  • Experience with data visualization tools to communicate insights effectively.
  • Work closely with Data Scientists, Machine Learning engineers, IT teams, and Business stakeholders spread across various locations in the US and India to achieve business goals.
  • Strong understanding of mathematical concepts including, but not limited to, linear algebra, advanced calculus, partial differential equations, and statistics, including Bayesian approaches.
  • Strong programming experience, including an understanding of data structures, algorithms, compression techniques, high-performance computing, distributed computing, and various computer architectures.
  • Good understanding of and experience with traditional data science approaches such as sampling techniques, feature engineering, classification and regression, SVMs, trees, and model evaluation.
  • Additional course work, projects, research participation, and/or publications in natural language processing, reasoning and understanding, information retrieval, text mining, search, computational linguistics, ontologies, or semantics.
  • Experience developing and deploying products in one or more of the following languages: Python (preferred), C++, Java, Scala.
  • Understand business use cases and translate them for the team with a vision of how to implement them.
  • Identify enhancements and build best practices that can help to improve the productivity of the team.

Academic Qualifications:

  • Bachelor’s degree or above in Computer Science, Computational Linguistics, Mathematics, Physics, or Electrical Engineering

Nice to Have

  • Master’s or PhD degree in Computer Science, Artificial Intelligence, Computational Linguistics, Machine Learning, or related technical field from a strong academic program.
  • Familiarity with medical concepts and codes from standard ontologies (SNOMED CT, LOINC, RxNorm, ICD, etc.).
  • Experience with Hugging Face libraries.
  • Lucene, Solr, or Elasticsearch experience.
  • Experience with Kubernetes and Docker.
  • Experience building REST APIs for AI work and knowledge of microservices architecture.
  • Participation in open source community projects.
  • Publication record in NLP conferences (NIPS, ICLR, ACL, NAACL, EMNLP, SIGIR, WWW, etc.).
  • Hands-on experience with one or more high-performance or distributed computing tools such as Spark, Dask, Hadoop, or CUDA-based distributed GPU computing.

Technical Requirements:

We require that all employees have the following technical capability at their home: High speed internet over 10 Mbps and, specifically for all call center employees, the ability to plug in directly to the home internet router. These at-home technical requirements are subject to change with any scheduled re-opening of our office locations. 

Evolent Health is an equal opportunity employer and considers all qualified applicants equally without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, or disability status.

Compensation Range: The minimum salary for this position is $100,000, plus benefits. Salaries are determined by the skill set required for the position and commensurate with experience and may vary above and below the stated amounts.
