Engineering Manager, AI & Data Infrastructure

Decagon
San Francisco, CA
Onsite

About The Position

Decagon is seeking a hands-on Engineering Manager to lead the AI & Data Infrastructure team. This is a deeply technical player/coach role focused on the core data and inference systems that power Decagon's AI agents. You will lead the team responsible for streaming and batch pipelines, real-time databases, and the GPU and model-serving platforms used for LLM inference. You will stay involved in code and system design, participate in incident response, and contribute directly when needed. A key part of the role is leading by example in AI-assisted engineering and setting standards for how AI coding tools are used to improve efficiency and quality. You will also hire and develop a high-performing team, and partner with internal teams to ship capabilities quickly and safely across cloud, single-tenant, and on-prem deployment environments. Success requires strong people leadership, reliable execution across concurrent commitments, and enough technical depth to make sound architectural decisions.

Requirements

  • 2+ years of engineering management experience leading high-performing data, ML, or infrastructure teams, with a strong IC background before that.
  • Deep technical expertise in streaming/batch processing, analytical databases, or model-serving; you're comfortable dropping into the codebase and shipping a PR.
  • Hands-on experience operating large-scale data systems (Kafka, ClickHouse/Snowflake/BigQuery, Postgres at scale) and/or production model-serving infrastructure on GPUs.
  • Familiarity with cloud platforms (AWS, GCP, or Azure), Kubernetes, and infrastructure-as-code.
  • A track record of delivering multi-quarter data or ML infrastructure initiatives through ambiguity.
  • A strong point of view on AI-assisted engineering — you use the tools yourself and have opinions on where they work.
  • Deep care for engineering craft, operational excellence, and cost discipline.

Nice To Haves

  • Experience operating LLM inference infrastructure in production — GPU capacity planning, multi-provider routing, and inference evals.
  • Experience with realtime analytics engines (ClickHouse, Pinot, Druid) and CDC pipelines at scale.
  • Experience delivering data and ML systems into single-tenant, on-prem, or air-gapped enterprise environments.
  • Experience building internal tooling or agents that use LLMs to accelerate engineering work.
  • Background in security and compliance frameworks (SOC 2, PCI DSS, FedRAMP, or similar).

Responsibilities

  • Build, lead, and develop a high-performing team of data and ML infrastructure engineers, including hiring, coaching, and performance management.
  • Own the technical strategy and roadmap for Decagon's AI & Data Infrastructure — streaming/batch data, realtime databases, and the GPU and model-serving stack powering LLM inference.
  • Stay hands-on: review designs and PRs with depth, lead architecture for hard problems, and contribute code when the team needs it.
  • Drive architecture for high-throughput data systems and low-latency inference, including multi-provider LLM routing and CDC pipelines at scale.
  • Set reliability, quality, and cost standards — data freshness SLOs, inference latency and availability, GPU and analytical cost discipline — and build an operating cadence that keeps the platform healthy as we scale.
  • Invest in developer and analyst experience — paved paths for producing and consuming data, and evals and observability for inference.
  • Raise the bar on AI-assisted engineering: define how your team uses AI coding tools to ship faster with higher quality, and build the workflows and guardrails that make this durable.
  • Partner with Research, Product Engineering, Platform, and customer-facing teams to deliver data and inference capabilities on aggressive timelines, including for enterprise deployments.

Benefits

  • Take-what-you-need vacation policy
  • Medical, Dental, and Vision benefits for you and your family
  • Life Insurance and Disability Benefits
  • Retirement Plan (e.g., 401(k), pension)
  • Parental Leave
  • Fertility and family building benefits through Carrot
  • Daily lunches and snacks in the office