ML/LLM Operations Engineer

Evolent

About The Position

Evolent partners with health plans and providers to achieve better outcomes for people with the most complex and costly health conditions. Working across specialties and primary care, we seek to connect the pieces of a fragmented health care system and ensure people get the same level of care and compassion we would want for our loved ones. Evolent employees enjoy work/life balance, the flexibility to suit their work to their lives, and the autonomy they need to get things done. We believe that people do their best work when they're supported to live their best lives, and when they feel welcome to bring their whole selves to work. That's one reason why diversity and inclusion are core to our business. Join Evolent for the mission. Stay for the culture.

We are seeking a skilled ML/LLM Operations Engineer to join our Data Science team at Evolent Health to ensure our AI systems deliver consistent, reliable, and compliant results in healthcare settings. This role is perfect for someone who thrives at the intersection of machine learning, operations, and healthcare compliance. The role combines a deep understanding of LLM behavior and evaluation with a meticulous approach to monitoring, quality assurance, and regulatory compliance in healthcare applications. This position will play a critical role partnering with our Data Science and Engineering teams while also interacting with cross-functional organizations including DevOps, Compliance, Quality Assurance, Clinical Support, and Product Management to ensure our AI systems operate reliably and meet all healthcare industry requirements.

Requirements

Bachelor's or master's degree in computer science, data science, or a related field
2+ years of experience with Python development and at least one production LLM implementation
Strong proficiency in SQL for complex log analysis and metrics generation
Demonstrated experience with LLM APIs and frameworks (PydanticAI, LangChain, or similar)
Experience with monitoring tools and practices for AI systems, including performance metrics, drift detection, and alerting
Understanding of LLM behavior, prompt engineering, and common failure modes in production
Experience building evaluation or testing frameworks for AI/ML systems
Strong communication skills for cross-functional collaboration
High-speed internet over 10 Mbps and, specifically for all call center employees, the ability to plug directly into the home internet router.
All candidates must complete a comprehensive background check, in-person I-9 verification, and may be subject to drug screening prior to employment.

Nice To Haves

Experience with healthcare AI applications and compliance requirements is preferred
Familiarity with multiple LLM providers (OpenAI, Anthropic, Google, Azure) is preferred
Knowledge of the Pydantic ecosystem, including PydanticAI and Logfire, is preferred
Understanding of LLM evaluation metrics and methodologies is preferred
Experience building tools for non-technical users is preferred
Basic knowledge of containerization (Docker) for local testing and development is preferred
Experience with cloud environments (AWS, Azure) as a user is preferred
Understanding of API rate limiting, quota management, and cost optimization strategies is preferred
Knowledge of CI/CD concepts for ML model deployments is preferred
Experience with regulatory compliance and audit processes is preferred
Excellent documentation skills and attention to detail are preferred

Responsibilities

Develop and maintain standardized evaluation frameworks to consistently measure LLM performance across relevant healthcare metrics
Build monitoring systems using Logfire to track AI model performance, detect drift, and alert the team to anomalies
Create testing infrastructure for prompt versions, model selection, and quality assurance processes
Design and implement audit sampling processes for continuous quality monitoring and clinical review workflows
Oversee regulatory compliance processes, including documentation for bias assessments, model cards, and audit trails required in healthcare
Optimize LLM operations through intelligent model selection, prompt engineering, and cost management strategies
Support the transition from successful POCs to production-ready services with appropriate testing and validation
Partner with DevOps on infrastructure requirements while focusing on AI-specific monitoring and optimization
Create and maintain documentation, runbooks, and operational procedures for all deployed AI systems
Collaborate with the Clinical Support Liaison to incorporate clinical feedback into system improvements
Prepare regular reports on AI system quality, performance metrics, and compliance status