About The Position

At Komodo Health, our mission is to reduce the global burden of disease. We believe that smarter use of data is essential to this mission, which is why we built the Healthcare Map, the industry’s largest, most complete, and most precise view of the U.S. healthcare system. The map combines de-identified, real-world patient data with innovative algorithms and decades of clinical experience, and it serves as the foundation for a suite of software applications. These applications help partners across the ecosystem answer healthcare’s most complex questions: unlocking critical insights, tracking patient behaviors and treatment patterns, identifying gaps in care, and addressing unmet patient needs.

Komodo Health's values are being awesome, seeking growth, delivering “wow,” and enjoying the ride. They foster a team of ambitious, supportive individuals who are passionate about the mission.

The Infrastructure, SRE & Security team forms the foundation of Komodo's AI platform. It owns the cloud, data, and security infrastructure that enables Product and Engineering to deliver with velocity and confidence, including AI agent runtimes, platform services, customer-facing SaaS products, and the data pipelines powering the Healthcare Map.

Komodo Health is seeking a hands-on Head of Infrastructure, SRE & Security to lead this centralized, cross-functional team during the company's transformation into an AI-native platform company. The role spans four key domains: Cloud Infrastructure, Data Infrastructure, Security Engineering, and Platform/Shared Services. It requires close partnership with AI Engineering, Platform Engineering, and Product leadership. The goal is to build the infrastructure Komodo needs to become a 100% AI company.

Requirements

  • 8+ years in infrastructure, SRE, or platform engineering
  • 3+ years leading teams in an AI/ML-intensive environment
  • Hands-on experience with AI workload infrastructure — LLM serving, agent orchestration, GPU compute, or ML pipelines — and the reliability and cost challenges they introduce
  • Deep AWS and production Kubernetes expertise (EKS, autoscaling, multi-cluster management) and strong IaC discipline (Terraform or equivalent)
  • Demonstrated track record of driving significant cloud cost reduction through systematic FinOps — team-level budgets, cost-per-unit metrics, and leadership-facing dashboards
  • Practical security and compliance experience — cloud posture management, vulnerability lifecycle, IAM, and SOC 2 or equivalent frameworks; comfort in regulated environments
  • Strong executive communication skills — able to translate infrastructure strategy into business outcomes for CTO, Finance, Legal, and Product stakeholders
  • Active user of AI tools in your own workflow; track record of driving AI-assisted automation adoption within your teams

Nice To Haves

  • Snowflake administration and data infrastructure experience at scale
  • Multi-cloud environment experience (AWS + GCP)
  • Healthcare, life sciences, or regulated industry background
  • Experience with security automation or agentic security workflows
  • Familiarity with data pipeline technologies (Spark, Airflow, Temporal)
  • Experience supporting multi-tenant SaaS infrastructure

Responsibilities

  • Own the architecture, reliability, and cost efficiency of Komodo's cloud infrastructure (AWS primary, GCP); drive full IaC coverage and lead Kubernetes operations at scale
  • Own data infrastructure operations, cost governance, and security hardening; partner with Data Product Engineering on modernizing data delivery infrastructure
  • Lead security posture management across cloud, application, and identity layers — vulnerability lifecycle, penetration testing, IAM, SOC 2 compliance, and AI security governance
  • Define and instrument cost-per-unit metrics, implement per-team budgets with automated alerting, and give leadership direct visibility into infrastructure efficiency
  • Operate internal developer platforms with self-service onboarding, CI/CD, and observability infrastructure that improves engineering velocity
  • Own incident response, on-call rotations, and post-mortem processes; drive reduction in preventable operational incidents and maintain high availability SLAs
  • Lead, recruit, and grow a globally distributed team of cloud, data, and security engineers; foster a culture of ownership and technical excellence

Benefits

  • Comprehensive health, dental, and vision insurance
  • Flexible time off and holidays
  • 401(k) with company match
  • Disability insurance
  • Life insurance
  • Leaves of absence in accordance with applicable state and local laws and regulations and company policy
  • Performance-based bonuses
  • Equity awards