MLOps Engineer

sauron.systems · San Francisco, CA

About The Position

We're looking for an MLOps Engineer who thrives at the intersection of perception systems, infrastructure, and real-world deployment. You'll play a key role in ensuring our cutting-edge AI systems can be deployed to homes across the country reliably, securely, and at scale. Your work will span everything from robust ML deployment infrastructure on the edge to networking and observability on real devices in the field. If you've ever wanted to put advanced robotics and AI into the hands of everyday people, this is the place to do it.

Requirements

  • 3-5+ years of experience in DevOps, deployment engineering, or site reliability engineering, ideally with production ML systems or robotics.
  • Deep operational experience with Linux system administration, system packaging (e.g., Deb/RPM), and configuration management tools (e.g., Ansible, SaltStack, Chef).
  • Strong experience with ML deployment/serving frameworks and infrastructure (e.g., TorchServe, custom C++ inference services).
  • Comfortable working in Linux-heavy environments with advanced shell scripting and strong knowledge of operating system internals.
  • Hands-on experience with networking fundamentals, including TCP/IP, firewalls, NAT traversal, and VPNs.
  • Prior experience managing large-scale edge fleets, including over-the-air (OTA) updates and blue-green deployment strategies.
  • A proven track record of developing internal developer tools or CLI applications that automate complex infrastructure tasks and improve team productivity.

Nice To Haves

  • Experience deploying AI/ML inference pipelines on bare-metal or virtualized edge hardware (e.g., using GStreamer/DeepStream pipelines, custom executables).
  • Expertise in machine learning inference engineering, including quantization and compilation (e.g., using ONNX Runtime, TensorRT), for efficient deployment to various edge hardware targets (e.g., NVIDIA Jetson, custom ARM SoCs).
  • Familiarity with writing or debugging high-performance, low-latency ML inference services in C++.
  • Exposure to remote logging, log ingestion, and distributed telemetry aggregation.
  • Previous experience in early-stage startups or fast-paced hardware/software integration environments.

Responsibilities

  • Own and evolve the deployment lifecycle for our perception systems across edge and cloud environments.
  • Design and manage highly available ML serving infrastructure, ensuring high-performance, low-latency inference and reliability in production.
  • Build resilient CI/CD pipelines for testing and shipping system updates with confidence, backed by comprehensive fleet observability.
  • Implement and manage remote monitoring, alerting (e.g., Prometheus, Grafana, Sentry), and debugging systems to ensure operational excellence, with a focus on fleet health metrics (e.g., uptime, resource utilization, inference latency).
  • Work closely with perception and backend teams to design deployable systems that are robust in the real world.
  • Integrate and maintain experiment tracking and model management platforms (e.g., Weights & Biases, MLflow) to streamline model lineage, performance comparison, and versioning from research to production.
  • Contribute to security policy design and device authentication/attestation infrastructure for fleet safety.
  • Build and maintain internal tooling and CLI utilities to streamline the end-to-end development-to-deployment workflow, empowering the broader engineering team to ship perception systems with high velocity and minimal friction.