Staff Data Engineer, AI & Robotics

General Motors, Warren, MI
Onsite

About The Position

The Staff Data Engineer, AI and Robotics, will join the AI Research team within the Autonomous Robotics Center (ARC). This role sets the technical direction for the robotics data backbone that enables scalable robot learning in manufacturing, from data capture and curation through versioning, serving, and auditing. Your work will make model development reproducible, testable, and production-ready, while establishing the infrastructure standards and operating patterns that accelerate robotics AI across programs. This is a senior technical leadership role in robotics and machine learning infrastructure, focused on multimodal robotic datasets and continuous model iteration. You will work across AI research, robotics engineering, manufacturing, and validation teams to turn real-world robot behavior and failures into high-quality training data, robust production systems, and durable platform capabilities used broadly across the organization.

Requirements

  • B.S. or M.S. in Computer Science, Computer Engineering, Data Engineering, or a related field.
  • 8+ years of experience building production data systems and/or ML infrastructure, including practical experience supporting training pipelines end-to-end.
  • Strong proficiency in Python and at least one of: C++, Scala, or Java.
  • Demonstrated engineering discipline in testing, documentation, system design, and operational reliability.
  • Experience with dataset versioning, lineage, and reproducibility tooling such as DVC or equivalent approaches.
  • Experience with experiment tracking and model registry patterns such as MLflow or equivalent tools.
  • Experience designing technical systems that support multiple stakeholders and use cases, with the ability to influence architecture beyond an individual project.
  • Ability to work onsite with hardware and robotics teams, and to design pipelines that handle real-world robotic logging constraints such as bandwidth limits, dropped frames, and timing drift.

Nice To Haves

  • Hands-on robotics logging and replay experience, including ROS 2 bags and system telemetry pipelines.
  • Experience with simulation-to-real data workflows and dataset synthesis strategies.
  • Familiarity with data governance requirements and auditability in safety-adjacent or safety-critical systems.
  • Experience building tools to support data labeling workflows, quality assurance, and active learning loops.
  • Experience serving as a technical lead, setting engineering standards, and mentoring senior or mid-level engineers across complex initiatives.

Responsibilities

  • Define and drive the technical vision for multimodal robotics data infrastructure spanning vision, depth, force/torque, joint states, events, and metadata across lab and plant-adjacent environments.
  • Architect and scale reliable data capture, ingestion, and serving pipelines that support robot learning workflows from experimentation through production deployment.
  • Establish reproducible data logging and replay frameworks, including ROS 2 bagging where applicable, to enable debugging, regression testing, root-cause analysis, and dataset creation at scale.
  • Own the strategy for dataset lifecycle management, including versioning, lineage, provenance, governance, retention, and quality gates, to support trustworthy model training and evaluation.
  • Lead the integration of experiment tracking, model/data traceability, and auditability patterns so teams can compare runs, reproduce results, and understand system changes over time.
  • Design and implement MLOps automation patterns, including CI/CD/CT-style pipelines for ML systems, that reduce manual effort and improve deployment confidence for robotics AI updates.
  • Partner with AI/ML, planning, validation, and plant teams to define data contracts such as schemas, labeling standards, and failure taxonomies, and convert field failures into curated training datasets and measurable learning loops.
  • Influence architecture across adjacent systems and mentor engineers on best practices in data engineering, ML infrastructure, observability, and production reliability.
  • Drive cross-functional technical decisions, balancing research velocity with platform robustness, governance, and long-term maintainability.

Benefits

  • From day one, we're looking out for your well-being, at work and at home, so you can focus on realizing your ambitions. Learn how GM supports a career that rewards you personally by visiting Total Rewards resources.