ML Infra Engineer (Data)

Rhoda AI — Palo Alto, CA

About The Position

At Rhoda AI, we're building the full-stack foundation for the next generation of humanoid robots — from high-performance, software-defined hardware to the foundational models and video world models that control it. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling scenarios unseen in training. We work at the intersection of large-scale learning, robotics, and systems, with a research team that includes researchers from Stanford, Berkeley, Harvard, and beyond.

We're not building a feature; we're building a new computing platform for physical work — and with over $400M raised, we're investing aggressively in the R&D, hardware development, and manufacturing scale-up to make that a reality.

We're looking for a Senior ML & Data Infrastructure Engineer to own and scale the systems that power our model training data pipeline — from raw ingestion and storage to indexing, retrieval, and throughput optimization at massive scale.

Requirements

  • 5+ years of experience in data infrastructure, distributed systems, ML infrastructure, or a closely related field
  • Strong experience building and operating large-scale data pipelines (1B+ samples or petabyte-scale systems preferred)
  • Deep understanding of distributed systems, databases, indexing strategies, and cloud storage architectures
  • Experience optimizing data throughput, workload balancing, and cost-performance tradeoffs in cloud environments
  • Strong skills in observability, monitoring, and production reliability for high-scale systems
  • Strong software engineering fundamentals with the ability to own systems end-to-end, from design to production

Nice To Haves

  • Experience managing large multimodal datasets
  • Familiarity with ML training workflows and data lifecycle management
  • Familiarity with vision-language models (VLMs) and experience running ML inference workloads at scale in distributed or cloud environments
  • Experience with robotics data formats or real-world sensor data (video, proprioception, teleoperation logs)
  • Familiarity with data versioning and lineage tooling (e.g., DVC, Delta Lake, or similar)

Responsibilities

  • Architect, build, and scale a high-throughput data infrastructure that processes and manages billions of video clips with strong guarantees around reliability, latency, and cost efficiency
  • Design and optimize large-scale storage systems (cloud object storage, databases, metadata stores) for multimodal datasets
  • Build efficient indexing and retrieval systems to support fast dataset querying, filtering, and iteration for research and production use cases
  • Develop observability frameworks for data pipelines including monitoring, alerting, failure recovery, and performance optimization
  • Implement intelligent workload balancing and throughput optimization across distributed compute and storage systems
  • Manage data artifacts, versioning, and lineage to ensure reproducibility and traceability across training runs
  • Build internal interfaces and lightweight tools that enable researchers and engineers to explore, query, and analyze large datasets at scale
  • Support integration and scalable deployment of vision-language models (VLMs) within data pipelines for screening, enrichment, or metadata generation

Benefits

  • Own the data foundation that everything else runs on — model quality is only as good as the data infrastructure beneath it
  • Direct collaboration with research and ML systems teams; your work has immediate, measurable impact on training velocity
  • High ownership in a small team — you'll make real architectural decisions, not execute tickets
  • Help build the infrastructure that powers robots operating in the real world, at scale