MLOps Engineer

PepsiCo
Albuquerque, NM
Hybrid

About The Position

We are seeking a mid-level MLOps Engineer to build, operate, and evolve our Kubeflow-based ML platform on Azure. This role focuses on enabling reliable, scalable, and cost-efficient ML workflows by designing CI/CD pipelines, managing Kubernetes-based ML infrastructure, improving platform observability, and supporting Machine Learning Engineering (MLE) and Data Science teams across the model lifecycle. The ideal candidate is hands-on, comfortable working across infrastructure and ML workflows, and motivated to operationalize best practices in MLOps.

Requirements

  • 3–6 years of experience in MLOps, DevOps, or Platform Engineering
  • Hands-on experience with Kubeflow, Kubernetes, Terraform (IaC), and containerized ML workloads
  • Strong experience with Azure cloud services (AKS, ACR, Storage, Networking, IAM, AD Groups)
  • Proficiency in Python and familiarity with ML frameworks (TensorFlow, PyTorch, scikit-learn)
  • Experience building CI/CD pipelines (GitHub Actions, Azure DevOps, Argo, etc.)
  • Understanding of ML lifecycle management (training, inference, monitoring, retraining)
  • Familiarity with observability tools (Prometheus, Grafana, Azure Monitor, Datadog)
  • Strong collaboration and communication skills

Responsibilities

  • Deploy, configure, and operate Kubeflow components on Azure Kubernetes Service (AKS)
  • Support Kubernetes workloads for training, inference, and batch pipelines
  • Manage container images, registries, and ML runtime environments
  • Assist with Kubeflow and Kubernetes upgrades under senior guidance
  • Build and maintain CI/CD pipelines for ML workflows and platform services
  • Automate model training, validation, and deployment pipelines
  • Implement reproducibility and versioning for data, models, and pipelines
  • Implement logging, monitoring, and alerting at the platform level
  • Diagnose and resolve workflow, pipeline, and infrastructure failures
  • Support SLAs and reliability objectives for ML platforms
  • Work closely with MLEs and Data Scientists to onboard workflows onto Kubeflow
  • Provide best practices, templates, and documentation for ML teams
  • Collaborate with Infra and Security teams on access control and compliance needs
  • Assist with collecting and reporting costs at Kubeflow namespace or workflow level
  • Identify optimization opportunities related to compute usage and scheduling

Benefits

  • Hybrid work model: a combination of remote work and in-office collaboration to enable innovation
  • Entrepreneurial environment in a leading international company
  • Professional growth possibilities & learning opportunities
  • Variety of benefits to support your physical, emotional and financial wellbeing
  • Volunteering opportunities to help external communities