Senior MLOps Engineer

Bonsai Robotics
San Jose, CA

About The Position

We’re looking for an MLOps engineer who thrives in real-world robotics environments and can own the entire machine learning lifecycle—from data ingestion and labeling to training, evaluation, and performance monitoring. You’ll support a perception stack spanning 2D/3D object detection, semantic and instance segmentation, depth estimation, and multi-sensor fusion across camera and lidar. This role is deeply cross-functional: you’ll work with perception engineers, autonomy engineers, field operations, and external labeling teams. The work is fast, tangible, and impacts every vehicle that goes into the field.

Requirements

  • 4–7+ years of industry experience in MLOps, ML infrastructure, data engineering, or applied ML engineering.
  • Strong Python development skills.
  • Experience building robust data pipelines for large-scale vision or lidar datasets.
  • Experience managing and operating cloud infrastructure (e.g., AWS EC2, S3, IAM, autoscaling, spot fleets).
  • Familiarity with ML lifecycle tooling (MLflow, Weights & Biases, Metaflow, Airflow, Ray, etc.).
  • Experience managing labeling workflows or working directly with annotation vendors.
  • Strong debugging instincts across the full stack—from data issues to training failures to evaluation anomalies.

Nice To Haves

  • Experience with PyTorch, CUDA, and common CV/3D libraries.
  • Experience with multi-sensor fusion, BEV architectures, or 3D perception.
  • Familiarity with MCAP, ROS2, Foxglove, and real-time robotics systems.
  • Experience with autonomous vehicle pipelines or industrial/agricultural robotics.
  • Background in active learning or automated label-quality scoring.
  • Experience building synthetic data augmentations or simulator-driven dataset expansion.
  • Experience building auto-labeling pipelines.

Responsibilities

  • Build and maintain scalable data pipelines for 2D/3D detection, segmentation, instance segmentation, and depth estimation.
  • Develop data workflows across multi-camera systems and lidar stored in MCAP format.
  • Own dataset versioning, metadata tracking, and reproducibility systems.
  • Improve training throughput using distributed systems (Ray, PyTorch Lightning, custom launchers).
  • Optimize data formats and loaders for large-scale vision and lidar datasets.
  • Build automated tools for dataset selection, active learning, hard-sample mining, and outlier detection.
  • Maintain dashboards and automated checks for dataset health, label quality, class balance, and environment coverage.
  • Partner with field teams to prioritize data collection runs and close the loop between field issues and dataset refreshes.
  • Manage internal labelers and external labeling vendors.
  • Define annotation standards for camera and lidar tasks.
  • Build QA workflows, reviewer interfaces, and automated label-consistency checks.
  • Identify systematic labeling errors and drive corrective processes.
  • Build pipelines for continuous evaluation using telemetry from vehicles in the field.
  • Monitor model drift, identify edge cases, and manage regression tests across “golden” datasets.
  • Track on-vehicle performance signals to flag data needs, degradations, or unexpected behavior.
  • Work closely with perception engineers on calibration, sensor models, data schemas, and on-vehicle inference constraints.
  • Coordinate with autonomy and perception teams to align ML outputs with navigation needs.
  • Work with the platform team to integrate ML pipelines into core platform infrastructure.
  • Partner with fleet operations to review real-world performance and prioritize new data collection.