ML & Cloud Infrastructure Engineer

Gritt Robotics
Belmont, CA
Onsite

About The Position

Gritt is developing physical AI to automate the construction of large-scale infrastructure around the globe. Gritt’s systems are already deployed commercially in difficult outdoor environments and are helping to build critical energy infrastructure. The founding team comprises experts in robotics and AI from Carnegie Mellon, Stanford and MIT. Gritt is a Series A company backed by marquee VCs.

We’re looking for an experienced ML & Cloud Infrastructure Engineer to join our team in person in the SF Bay Area. As an early member, you will play a pivotal role in architecting scalable cloud infrastructure for our AI and data pipelines. You’ll need to thrive in a fast-paced startup environment where you’ll wear multiple hats and have a direct impact on our product’s evolution. Ideally, you have a proven track record of developing and deploying high-performance ML and cloud pipelines in production, and you’re passionate about pushing the boundaries of what’s possible in robotics with AI.

Requirements

  • Degree in computer science or a related engineering discipline (or equivalent experience).
  • 4+ years of experience deploying high-performance ML pipelines in production.
  • Proficient in Python and comfortable with C++/Go.
  • Experience with ML frameworks like PyTorch.
  • Experience with I/O and data-loading workflows, including formats such as Parquet, HDF5, and TFRecord.
  • Experience with deploying on cloud platforms like AWS, GCP or Azure.
  • Experience with tooling like Docker, Kubernetes, and Airflow.
  • Comfortable taking ownership of tasks with light supervision.
  • Excellent problem-solving skills.
  • Legally authorized to work in the United States.

Responsibilities

  • Develop and deploy scalable AI training and validation pipelines in the cloud.
  • Spin up distributed pipelines for data ingestion, pre-processing, training and evaluation.
  • Deploy monitoring and CI/CD pipelines.
  • Enable large-scale evaluation of AI models via cloud-based metrics.
  • Enable large-scale evaluation of autonomy software and models via simulations in the cloud.
  • Optimize performance, I/O and GPU utilization.
  • Build tooling and dashboards for rapid experimentation, orchestration and visualization.
  • Work with other teams to integrate cloud tooling into workflows.