AI / ML Engineer

Third Way Health • Cambridge, MA

About The Position

We're seeking a Senior ML Engineer to build next-generation AI systems that help millions of patients access care faster. You'll architect production ML infrastructure handling thousands of hours of service interactions daily in a highly regulated healthcare environment. This is a high-impact individual contributor role, ideal for someone eager to “own the outcome” and push the boundaries of “high tech + high touch” care experiences.

Requirements

5+ years of software engineering experience, with 3+ years focused on machine learning or applied AI systems.
Strong proficiency in Python, particularly for ML pipelines, frameworks, inference services, and APIs (e.g., scikit-learn, Sanic API, PyTorch Lightning, Pydantic AI, LangGraph, Bedrock, OpenAI / Anthropic SDKs).
Experience designing ML-centric data architectures, including feature stores, vector databases, and time-series systems for monitoring and analytics.
Hands-on experience with cloud-native inference: containerized model serving, autoscaling, GPU/accelerator workloads, and low-latency production deployments.
Experience operating end-to-end MLOps platforms (e.g., MLflow, Kubeflow), including CI/CD for models, experiment tracking, and rollout strategies.
Solid understanding of workflow orchestration (graph-based execution, retries, state management) in ML and agent-based systems.
Excellent communication skills, with the ability to collaborate effectively across engineering, product, and non-technical stakeholders.
Strong interest in healthcare innovation and building AI systems that meaningfully improve health outcomes.
Working knowledge of AI safety, bias detection, and responsible AI practices.

Nice To Haves

Experience building AI systems in healthcare or other regulated environments, and familiarity with standards such as HIPAA, GDPR, or FDA guidance.
Proven experience leading complex technical initiatives and mentoring junior engineers.
Strong applied knowledge of event-driven architectures and streaming systems (Kafka, Pub/Sub, Kinesis, RabbitMQ).
Hands-on experience designing and operating vector search, RAG pipelines, and hybrid retrieval systems.
Experience with agent frameworks, multi-agent coordination patterns, and long-running agent loops in production environments.
Familiarity with real-time analytics stacks combining streaming data, ML inference, and operational dashboards.

Responsibilities

Architect and build large-scale AI systems that integrate high-volume voice, text, and contextual event streams with extensive knowledge bases to deliver real-time recommendations, automations, and decision support.
Design and operate workflow-oriented AI systems, including DAG-based execution graphs, stateful pipelines, and agent-driven workflows with clear observability, reproducibility, and fault tolerance.
Build agent architectures spanning agent-to-agent coordination, feedback loops, tool-calling systems, and long-running autonomous workflows, balancing control, safety, and adaptability.
Design and implement data models, feature pipelines, and APIs to support model training, low-latency inference, and continuous learning.
Develop predictive, real-time analytics systems that combine streaming data, ML inference, and event-driven triggers to surface insights and automate actions at scale.
Implement and maintain end-to-end ML platforms, including model training, evaluation, deployment, online inference, monitoring, and drift detection.
Partner closely with product managers, data scientists, and QA engineers to translate experimental models into reliable, production-grade AI services.
Identify, diagnose, and resolve performance and scaling bottlenecks across data pipelines, inference services, and orchestration layers as production workloads grow.
