About The Position

Join our team building production ML infrastructure for enterprise-scale machine learning pipelines. You'll work on a platform that orchestrates end-to-end ML workflows from data ingestion through model training, evaluation, and deployment.

Requirements

  • Experience with PyTorch, transformers, or other ML libraries
  • Familiarity with ML model evaluation and experimentation
  • Interest in ML/AI infrastructure and operations
  • Strong problem-solving and debugging skills
  • Comfortable with Linux/command-line environments
  • Knowledge of AWS services (S3, SageMaker, IAM)
  • Exposure to Apache Airflow or workflow orchestration
  • Understanding of CI/CD, testing, or infrastructure-as-code

Responsibilities

  • Build and maintain Apache Airflow DAGs for ML pipeline orchestration
  • Develop SageMaker training jobs for NLP models (NeMo, PyTorch)
  • Implement MLflow tracking and model registry integrations
  • Write infrastructure-as-code using Terraform (AWS S3, IAM, VPC)
  • Create comprehensive tests for ML pipeline components
  • Follow spec-driven development practices with Claude Code
  • Contribute to ML observability and evaluation frameworks
© 2026 Teal Labs, Inc