Sr. Director, Infrastructure, SRE, & Security

Komodo Health

49d•Hybrid

About The Position

At Komodo Health, our mission is to reduce the global burden of disease. We believe that smarter use of data is essential to this mission, which is why we built the Healthcare Map, the industry’s largest, most complete, and most precise view of the U.S. healthcare system. The Healthcare Map combines de-identified, real-world patient data with innovative algorithms and decades of clinical experience, and it serves as the foundation for a powerful suite of software applications. These applications help answer healthcare’s most complex questions for partners across the ecosystem, enabling them to unlock critical insights, track patient behaviors and treatment patterns, identify gaps in care, address unmet patient needs, and ultimately reduce the global burden of disease.

Komodo Health's values include being awesome, seeking growth, delivering “wow,” and enjoying the ride. These values foster a team of ambitious, supportive individuals who are passionate about the mission.

The Infrastructure, SRE & Security team forms the foundation of Komodo's AI platform, owning the cloud, data, and security infrastructure that enables Product and Engineering to deliver with velocity and confidence. This includes AI agent runtimes, platform services, customer-facing SaaS products, and the data pipelines powering the Healthcare Map.

Komodo Health is seeking a hands-on Head of Infrastructure, SRE & Security to lead this centralized, cross-functional team as Komodo transforms into an AI-native platform company. The role spans four key domains: Cloud Infrastructure, Data Infrastructure, Security Engineering, and Platform/Shared Services. It requires close partnership with AI Engineering, Platform Engineering, and Product leadership. The goal is to build the infrastructure necessary for Komodo to become a 100% AI company.

Requirements

8+ years in infrastructure, SRE, or platform engineering
3+ years leading teams in an AI/ML-intensive environment
Hands-on experience with AI workload infrastructure — LLM serving, agent orchestration, GPU compute, or ML pipelines — and the reliability and cost challenges they introduce
Deep AWS and production Kubernetes expertise (EKS, autoscaling, multi-cluster management) and strong IaC discipline (Terraform or equivalent)
Demonstrated track record of driving significant cloud cost reduction through systematic FinOps — team-level budgets, cost-per-unit metrics, and leadership-facing dashboards
Practical security and compliance experience — cloud posture management, vulnerability lifecycle, IAM, and SOC 2 or equivalent frameworks; comfort in regulated environments
Strong executive communication skills — able to translate infrastructure strategy into business outcomes for CTO, Finance, Legal, and Product stakeholders
Active user of AI tools in your own workflow; track record of driving AI-assisted automation adoption within your teams

Nice To Haves

Snowflake administration and data infrastructure experience at scale
Multi-cloud environment experience (AWS + GCP)
Healthcare, life sciences, or regulated industry background
Experience with security automation or agentic security workflows
Familiarity with data pipeline technologies (Spark, Airflow, Temporal)
Experience supporting multi-tenant SaaS infrastructure

Responsibilities

Own the architecture, reliability, and cost efficiency of Komodo's cloud infrastructure (AWS primary, GCP); drive full IaC coverage and lead Kubernetes operations at scale
Own data infrastructure operations, cost governance, and security hardening; partner with Data Product Engineering on modernizing data delivery infrastructure
Lead security posture management across cloud, application, and identity layers — vulnerability lifecycle, penetration testing, IAM, SOC 2 compliance, and AI security governance
Define and instrument cost-per-unit metrics, implement per-team budgets with automated alerting, and give leadership direct visibility into infrastructure efficiency
Operate internal developer platforms with self-service onboarding, CI/CD, and observability infrastructure that improves engineering velocity
Own incident response, on-call rotations, and post-mortem processes; drive reduction in preventable operational incidents and maintain high availability SLAs
Lead, recruit, and grow a globally distributed team of cloud, data, and security engineers; foster a culture of ownership and technical excellence

Benefits

Comprehensive health, dental, and vision insurance
Flexible time off and holidays
401(k) with company match
Disability insurance
Life insurance
Leaves of absence in accordance with applicable state and local laws and regulations and company policy
Performance-based bonuses
Equity awards
