About The Position

PathAI's mission is to improve patient outcomes with AI-powered pathology. Our platform promises substantial improvements to the accuracy of diagnosis and the efficacy of treatment of diseases like cancer, leveraging modern approaches in machine learning and artificial intelligence. We have a track record of success in deploying AI algorithms for histopathology in translational research, pathology labs, and clinical trials. Rigorous science and careful analysis are critical to the success of everything we do. Our team, composed of diverse employees with a wide range of backgrounds and experiences, is passionate about solving challenging problems and making a huge impact on patient outcomes.

Where You Fit

As the Associate Director, MLOps Lead, you will lead the team responsible for the backbone of our AI/ML stack: the infrastructure that bridges ML research and massive-scale production. Your primary directive is to evolve our stack to meet the next scale of needs in large-scale ML training and inference workloads. You’re someone who enjoys designing and building for reliability, relishes collaboration and technical challenges, and takes pride in making things better – without taking yourself too seriously. Our technical space is broad: high-scale AI training and inference workloads, cloud infrastructure, Kubernetes, observability, distributed systems, and a bit of everything in between.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 2-3+ years of experience managing engineering team(s), with a focus on building production-grade frameworks for MLOps or ML Infrastructure.
  • Deep technical expertise with ML workloads on Kubernetes, cloud computing platforms (AWS/GCP/Azure), workflow orchestration (Airflow, Kubeflow, or proprietary equivalents), and DevOps principles and infrastructure-as-code (Helm, Terraform).
  • Proven experience managing petabyte-scale datasets and high-throughput production inference pipelines.
  • Strong software engineering skills in complex, multi-language systems and experience with scalable service architecture.
  • Experience using AI assistants (e.g. Copilot, Cursor, Claude) across the platform development lifecycle.

Nice To Haves

  • Exposure to ML frameworks like PyTorch or Scikit-learn.
  • Experience with large-scale data processing frameworks (e.g. Spark, Hive, Databricks, Amazon EMR).
  • Expertise in MLOps principles, including model lifecycle management, feature stores, model monitoring, and CI/CD for ML.
  • Familiarity with security and compliance best practices in ML systems.

Responsibilities

  • Vision and Roadmap: Develop and execute the long-term vision and roadmap for the MLOps team to support ML development and deployment needs across the business units. Successfully manage the tension between short-term tactical deliveries and long-term architectural transformation for future growth.
  • Team Management: Lead and mentor a team of 6-7+ high-performing engineers. Strategically allocate resources to manage support for existing services while executing key strategic initiatives.
  • Cross-Functional Collaboration: Partner with leaders across machine learning, data science, product engineering, and infrastructure to proactively identify pain points, address bottlenecks, and facilitate the deployment of new solutions.
  • Foundation Model Readiness: Architect the compute and storage pipelines required for ML Engineers to manage millions of slides and complex derived artifacts without data fragmentation or synchronization latency.
  • Inference Modernization: Modernize the AI Product inference stack to support 5-10x growth of AI runs across global deployments.
  • System Observability: Collaborate with Site Reliability Engineering (SRE) to establish comprehensive metrics covering compute under-utilization, network bottlenecks, and granular cost and turn-around-time attribution.
  • Technology Refresh: Conduct "Build vs. Buy" assessments, leading "Stack Refresh" audits to benchmark our proprietary tools against best-in-class commercial and open-source alternatives to meet our future needs.