Health Catalyst • Posted 12 days ago
Full-time • Mid Level
Remote
1,001-5,000 employees

As a DevOps / Site Reliability Engineer, you’ll help shape and sustain the infrastructure behind Armus, a core Health Catalyst platform that drives outcomes for clinicians and patients across the country. You’ll work closely with software engineers and product teams to design, automate, and operate the cloud environments that power our clinical registries and analytics solutions. This is a high-visibility, high-expectation role for someone who thrives on accountability, loves solving complex system problems, and wants to grow within a cross-product SRE group that spans multiple technologies and teams. You’ll ship improvements weekly, automate relentlessly, and end each day knowing your work improves healthcare outcomes.

If You Love

  • Building reliable, scalable systems that stay up and perform under pressure
  • Taking a half-defined problem and driving it to a clean, measurable solution
  • Balancing speed and safety through automation, testing, and disciplined process
  • Mentoring others, reviewing code, and strengthening DevOps culture
  • Working across application, infrastructure, and security boundaries to make systems better every week

Then this role will fit you perfectly.

  • Design, implement, and operate scalable, secure, and resilient infrastructure on Google Cloud Platform (GCP), with a heavy focus on Google Kubernetes Engine (GKE)
  • Apply best practices in container orchestration, networking, IAM, and workload identity
  • Lead cloud cost optimization, capacity planning, and efficient scaling initiatives
  • Manage infrastructure as code using Terraform or similar tools
  • Build and maintain CI/CD pipelines using Jenkins or GitLab CI/CD
  • Ensure reliable deployment flows across development, staging, and production environments
  • Implement automated checks and rollback mechanisms for safe, repeatable releases
  • Implement and refine observability using Sentry, Sumo Logic, and GCP Cloud Monitoring and Logging
  • Participate in the on-call rotation, respond quickly to operational issues, and drive long-term fixes
  • Collaborate with customer success and support teams to quantify and resolve production impact
  • Identify reliability risks early, automate detection and recovery, and reduce manual toil
  • Apply and maintain least-privilege IAM policies and secure configuration baselines
  • Partner with InfoSec to remediate vulnerabilities and support HIPAA and SOC 2 audit readiness
  • Contribute to incident response readiness and disaster recovery testing
  • Engage with the cross-product SRE squad to learn and contribute across multiple Health Catalyst platforms
  • Help standardize SRE best practices, tooling, and documentation
  • Mentor teammates and continuously raise the bar for reliability and automation
  • Maintain compliance with the organization's required training on Information Security, the Acceptable Use Policy, and HIPAA Privacy and Security.
  • Adhere to and comply with the organization's Acceptable Use Policy.
  • Safeguard information system assets by identifying and reporting potential and actual security events to the organization's Security and Compliance Officers.
  • 5–7 years of experience in DevOps, SRE, or Cloud Infrastructure Engineering
  • Deep expertise in GCP, especially GKE
  • Experience with other major clouds (AWS or Azure) is a plus
  • Strong working knowledge of Kubernetes and containerized deployments
  • Proven experience with CI/CD tools such as Jenkins or GitLab
  • Scripting experience in Python, Bash, or similar languages
  • Solid understanding of networking, security, and performance fundamentals
  • Hands-on experience with cloud cost management and optimization
  • Calm under pressure with strong troubleshooting and communication skills
  • Exposure to healthcare data or interoperability standards such as FHIR, HL7, or CDA
  • Familiarity with healthcare security and compliance frameworks like HIPAA and SOC 2
  • Experience in Agile or Scrum software development environments
  • Background supporting SaaS or multi-tenant systems
  • Health Catalyst has been named one of the 30 Best Workplaces in Technology by Fortune Magazine and is a winner of the Gallup Great Workplace Award.
  • Health Catalyst earned the highest overall score in Healthcare BI by KLAS and, for the sixth year in a row, was named to the Best Places to Work in Healthcare list by Modern Healthcare.
  • We offer best-in-class benefits that encourage ownership and inclusion: mentoring and sponsorship programs, remote-work friendliness, career development, company equity, and flexible PTO.