Senior MLOps Engineer

Clariti Cloud Inc.
7d | $190,000 - $230,000 | Remote

About The Position

Join our mission to provide governments with exceptional experiences so they can do the same for their communities!

What do we do? 💥
We empower governments to deliver exceptional citizen experiences. Check out our ‘About Us’ page for a deep dive into our product and what makes us exceptional.

How will you help us make an impact? 👩‍💻👨‍💻
The Senior MLOps Engineer will design, build, and scale the systems that power CivCheck and Clariti’s AI capabilities. As the first MLOps Engineer, you will lead the development of robust ML infrastructure, ensuring that models move efficiently from research to production with reliability, observability, and performance at scale.

This role is ideal for someone who thrives at the intersection of machine learning, software engineering, and cloud infrastructure, and who’s motivated to enable teams to deliver high-impact ML systems efficiently and safely.

Requirements

  • 6–10+ years of experience in software or ML engineering, with at least 3 years in MLOps or ML infrastructure.
  • Solid experience with programming and scripting languages such as Python, C, C++, and Bash.
  • Proven experience deploying and managing ML models in production.
  • Proficiency with Docker and Kubernetes for scalable ML system design.
  • Experience with cloud platforms (AWS, GCP, or Azure) and GPU orchestration.
  • Hands-on knowledge of CI/CD pipelines (GitHub Actions, Jenkins, or similar).
  • Familiarity with MLflow, Weights & Biases, Kubeflow, or similar tools for experiment tracking and pipeline automation.
  • Solid understanding of data versioning, model reproducibility, and monitoring strategies.
  • Excellent problem-solving skills and a collaborative, team-oriented mindset.

Nice To Haves

  • Experience training models from scratch, including defining architectures, curating & cleaning datasets, tuning training parameters, and bringing models from research to monitored production.
  • Exposure to model optimization techniques (quantization, distillation, TensorRT, ONNX).
  • Familiarity with infrastructure-as-code tools (Terraform, CloudFormation).
  • Background in distributed systems or high-performance computing.
  • Contributions to open-source projects.

Responsibilities

  • Design and maintain end-to-end ML pipelines for training, evaluation, and deployment of models and agentic AI workflows.
  • Build and optimize infrastructure for distributed training and model serving across GPU and cloud environments.
  • Develop tools for data creation, model versioning, experiment & performance tracking, and automated retraining.
  • Collaborate with AI researchers and ML engineers to productionize proofs of concept (POCs) and ensure model reproducibility and scalability.
  • Implement CI/CD best practices for ML systems, including continuous integration, automated testing, and deployment workflows.
  • Monitor and manage model health, performance, drift, and data quality in production.
  • Partner with Engineering teams to streamline infrastructure provisioning and data access.
  • Drive cost optimization and performance tuning for large-scale model training.
  • Contribute to internal documentation and best practices.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 101-250 employees
