Senior AI Systems Engineer

Archer
San Jose, CA
Onsite

About The Position

As a Senior AI Systems Engineer, you will architect, deploy, and manage the critical infrastructure services required for large-scale AI model training and inference. You will ensure our machine learning platforms are robust and efficient, bridging the gap between raw data and high-performance AI models.

Requirements

  • BS/MS/PhD degree in Computer Science, Software Engineering, or a related field.
  • 3+ years of professional software engineering experience with a dedicated focus on AI/ML systems, high-performance computing (HPC), or ML infrastructure.
  • Familiarity with hyperscaler infrastructure (AWS) alongside specialized AI-centric bare-metal and GPU clouds (Nebius AI Cloud).
  • Hands-on experience with containerization (Docker) and production-grade orchestration (Kubernetes), paired with cloud-agnostic cluster abstraction layers like SkyPilot to manage multi-region GPU availability.
  • Deep architectural understanding of large language models and the system infrastructure required to serve them at scale using frameworks like vLLM and SGLang.
  • Experience building high-throughput data pipelines to support large-scale training, including proficiency in SQL, NoSQL, and columnar storage formats optimized for ML (e.g., Parquet).

Nice To Haves

  • Familiarity with audio processing, speech-to-text frameworks, or Automatic Speech Recognition (ASR) pipelines.
  • Prior experience or a deep technical interest in aerospace, aviation, or autonomous systems (e.g., safety-critical software, edge-AI deployments).

Responsibilities

  • Deploy, scale, and manage resilient infrastructure services tailored for distributed AI model training and low-latency inference.
  • Use and maintain end-to-end tooling, including MLflow for experiment tracking and model registry, to streamline and optimize the AI development lifecycle.
  • Leverage specialized frameworks to maximize hardware utilization, managing multi-cloud compute scheduling alongside advanced LLM serving engines.
  • Partner closely with AI researchers and software engineers to productionize cutting-edge models, establish monitoring systems, and debug complex performance bottlenecks at the hardware-software interface.

Benefits

  • Base pay between $160,000 and $180,000