Senior ML Engineer I

Waystar • Louisville, KY

About The Position

We are seeking a highly skilled and innovative Senior ML Engineer with a passion for building robust, efficient, and domain-specific AI systems using Language Models (LMs) and agentic architectures. As a core member of the team, you will be instrumental in developing the entire ML pipeline, from sophisticated data extraction techniques to fine-tuning specialized LMs and orchestrating their interactions within a multi-agent framework. This is a unique opportunity to apply state-of-the-art Generative AI and NLP techniques to a real-world, high-impact problem, leveraging the latest research in agentic AI and LMs to deliver economical and powerful solutions.

Requirements

Bachelor's or Master's degree in Computer Science, Machine Learning, Artificial Intelligence, or a related quantitative field.
3+ years of professional experience in Machine Learning Engineering, with a strong focus on NLP.
Proven experience with Language Models (LMs), including model selection, fine-tuning, and deployment.
Strong proficiency in Python and familiarity with ML frameworks (e.g., PyTorch, TensorFlow, Hugging Face Transformers).
Solid understanding and hands-on experience with core NLP techniques and architectures, especially Transformers.
Experience with cloud platforms, particularly Google Cloud Platform (GCP), including services like Vertex AI, Cloud Storage, and compute services.
Familiarity with MLOps principles and tools for model serving, monitoring, and pipeline automation.
Excellent problem-solving skills, attention to detail, and ability to work independently and collaboratively.
Active use of artificial intelligence (AI) tools and platforms to streamline workflows, enhance performance, improve decision-making, and drive innovation across business functions.
Curiosity and adaptability in exploring emerging AI technologies, with a mindset for continuous learning and experimentation.

Nice To Haves

Hands-on experience building or contributing to agentic AI systems or multi-agent frameworks.
Direct experience with document processing technologies such as OCR, layout parsing, Document AI, or custom information extraction from unstructured text.
Experience with Vector Databases (e.g., pgvector, Pinecone, Weaviate, Qdrant) and RAG architectures.
Exposure to the healthcare domain, particularly understanding medical terminology, CPT/ICD codes, or regulatory documents.

Responsibilities

Design, implement, and optimize robust pipelines for ingesting, parsing, and extracting structured information from complex documents, leveraging OCR, document layout analysis, Named Entity Recognition (NER), and Relationship Extraction (RE).
Develop rich, nested JSON schemas for representing structured data and ensure scalable storage.
Generate and manage high-quality vector embeddings for efficient retrieval-augmented generation (RAG) within a Vector Database.
Research, select, and experiment with appropriate open-source Language Models, both large and small (e.g., the Phi-3, Mistral, Llama, and Nemotron-H families), for specialized tasks.
Design and execute efficient fine-tuning strategies (e.g., LoRA, QLoRA, full fine-tuning) on curated, domain-specific datasets to achieve precise performance for tasks like coverage determination, code lookups, and policy rule application.
Explore and implement knowledge distillation techniques to transfer capabilities from larger models to smaller, more efficient LMs.
Build and maintain the core agentic framework, including the orchestrator that intelligently routes queries and coordinates interactions between various specialized LM tools.
Develop and integrate "tools" (specialized LMs and external APIs) that perform atomic medical necessity tasks, ensuring strict behavioral alignment and structured outputs.
Deploy, manage, and monitor LMs and agentic components on Google Cloud Platform (GCP) using services like Vertex AI, GKE, Cloud Functions, and Cloud Run.
Implement robust MLOps practices for continuous integration, continuous delivery (CI/CD), model versioning, and performance monitoring (latency, throughput, accuracy).
Establish effective feedback loops from end-user interactions and system logs to identify areas for model improvement.
Curate and expand training datasets, ensuring data privacy (PHI/PII masking) and legal compliance.
Stay abreast of the latest research in LMs, agentic AI, NLP, and document understanding, applying relevant advancements to our system.
Work closely with subject matter experts, product managers, and other engineers to translate complex requirements into technical solutions and evaluate system performance.

Benefits

Competitive total rewards (base salary + bonus, if applicable)
Customizable benefits package (3 medical plans with Health Savings Account company match)
Generous paid time off for non-exempt team members, starting with 3 weeks + 13 paid holidays, including 2 personal floating holidays
Flexible time off for exempt team members + 13 paid holidays
Paid parental leave (including maternity + paternity leave)
Education assistance opportunities and free LinkedIn Learning access
Free mental health and family planning programs, including adoption assistance and fertility support
401(k) program with company match
Pet insurance
Employee resource groups