AI Engineer - Cloud Infrastructure

Traversal, New York, NY
Onsite

About The Position

As an AI Engineer - Cloud Infrastructure on Traversal’s Infrastructure team, you’ll design, secure, and operate the core systems that power Traversal’s AI products. We already serve Fortune 50 enterprises with large-scale, multi-tenant environments, BYOC deployments, and SOC 2 Type II controls, and we’re scaling rapidly. You’ll focus on the building blocks of our Terraform-defined infrastructure and Kubernetes environments, while meeting the complex demands of operating the highly available, highly resilient, and cost-efficient platform behind the Traversal AI SRE agent. This is a senior, high-impact role: you’ll own foundational systems and work across AWS-native infrastructure, cloud networking, Kubernetes environments, Terraform, Helm, Python, and more, shaping how enterprise AI reliability is built and scaled.

Requirements

  • 7+ years of experience at technically rigorous companies or teams
  • Proven experience operating cloud-native and Kubernetes-native infrastructure and applications at scale with >99.9% availability
  • Demonstrated hands-on experience with AWS, EKS, Terraform, and Helm
  • Experience designing idempotent systems (outbox, dedupe keys, safe replay)
  • Experience with incident response, chaos testing, and capacity planning
  • Strong debugging skills across infrastructure, compute, network, runtime, storage, and auth layers

Nice To Haves

  • Service mesh (Envoy/Istio), Cilium/eBPF
  • GPU workload operations, inference servers, token streaming gateways
  • Production experience building and maintaining systems in Python, Rust, and TypeScript
  • Data governance (PII discovery/redaction), lineage, tokenization
  • Experience designing, implementing, and deploying cross-region active/active architectures
  • Familiarity with other cloud providers (GCP, Azure, Oracle Cloud)

Responsibilities

  • Design scalable, reliable infrastructure for AI workloads, inference, data pipelines, and agentic workflows
  • Build and deliver best-in-class developer experience and software development lifecycle tooling for our growing engineering team
  • Scale on real signals (queue lag, in-flight requests, latency); add burst capacity and safe drains
  • Evolve Terraform+Helm for multi-environment deployments, secrets, policy-as-code, and workload identity
  • Build and deliver end-to-end visibility into our infrastructure, systems, and applications, and connect it to Traversal’s AI SRE agent for self-driving production
  • Partner with our cloud security lead to improve Traversal’s security and compliance posture, implementing least privilege principles, JIT access workflows, default-deny egress, auditability, and policy-as-code

Benefits

  • Health insurance
  • Startup equity
  • Great tech setup
  • Flexible time off
  • In-office snacks