About The Position

Oracle Health AI is building the next generation of intelligent, secure, and resilient healthcare cloud services that improve clinical and operational outcomes for healthcare providers and government agencies. We are seeking an experienced Senior Manager, Site Reliability Engineering, to lead a high-performing engineering organization responsible for the reliability, performance, security, automation, and operational excellence of Oracle Health AI services supporting Oracle Health Federal customers. This leader will drive the organization's transformation from traditional operations to a software-defined, AI-assisted, and automation-first operating model.

The ideal candidate is an engineering leader with deep experience in cloud-native platforms, Site Reliability Engineering (SRE), DevOps, and AI-enabled operations. They will partner across engineering, product, cloud infrastructure, security, compliance, and customer operations to deliver highly available, secure, and scalable services while fostering a culture of innovation, operational excellence, and continuous improvement.

Requirements

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or a related technical discipline, or equivalent practical experience.
  • 8+ years of experience in software engineering, Site Reliability Engineering (SRE), DevOps, platform engineering, production engineering, cloud operations, or related technical leadership roles supporting large-scale, customer-facing cloud services.
  • 3+ years of experience leading senior software engineers, SREs, DevOps teams, or production engineering organizations responsible for mission-critical production environments.
  • Demonstrated success transforming traditional operations into software-driven, automation-first, and AI-assisted operating models.
  • Proven experience eliminating operational toil through software engineering, intelligent automation, Infrastructure as Code, self-service platforms, AI agents, and exception-based operational workflows.
  • Experience owning production services across the full operational lifecycle, including platform deployment, customer onboarding, reliability engineering, incident response, change management, and ongoing production support.
  • Strong understanding of Site Reliability Engineering principles, including service level objectives (SLOs), service level indicators (SLIs), error budgets, observability, incident management, deployment safety, resilience engineering, disaster recovery, and continuous operational improvement.
  • Practical experience implementing AI-assisted engineering, AI-enabled automation, agentic workflows, intelligent operational tooling, or AI-supported engineering productivity solutions.
  • Demonstrated ability to evaluate AI adoption pragmatically by balancing automation opportunities with security, compliance, governance, deterministic execution, explainability, and appropriate human oversight.
  • Experience operating within regulated, security-sensitive, or high-availability environments, preferably supporting Federal or healthcare customers.
  • Strong executive communication, organizational leadership, and stakeholder management skills with the ability to communicate technical strategy, operational risk, engineering investments, and measurable business outcomes.

Nice To Haves

  • Experience with Oracle Cloud Infrastructure (OCI), Kubernetes, containerized platforms, Infrastructure as Code (Terraform), CI/CD pipelines, and modern observability platforms.
  • Experience supporting FedRAMP, DoD, VA, HIPAA, HITRUST, or other regulated cloud environments.
  • Experience leading large-scale cloud transformation initiatives and implementing SRE best practices across engineering organizations.
  • Proven track record recruiting, mentoring, and developing high-performing engineering leaders while fostering a culture of accountability, operational excellence, innovation, and customer focus.
  • Experience leveraging AI and machine learning technologies to improve engineering productivity, operational efficiency, reliability, and customer experience.

Responsibilities

  • Lead a high-performing engineering organization responsible for the reliability, performance, security, automation, and operational excellence of Oracle Health AI services supporting Oracle Health Federal customers.
  • Drive the organization's transformation from traditional operations to a software-defined, AI-assisted, and automation-first operating model.
  • Partner across engineering, product, cloud infrastructure, security, compliance, and customer operations to deliver highly available, secure, and scalable services.
  • Foster a culture of accountability, innovation, operational excellence, continuous improvement, and customer focus.
  • Eliminate operational toil through software engineering, intelligent automation, Infrastructure as Code, self-service platforms, AI agents, and exception-based operational workflows.
  • Own production services across the full operational lifecycle, including platform deployment, customer onboarding, reliability engineering, incident response, change management, and ongoing production support.
  • Evaluate AI adoption pragmatically by balancing automation opportunities with security, compliance, governance, deterministic execution, explainability, and appropriate human oversight.
  • Communicate technical strategy, operational risk, engineering investments, and measurable business outcomes to executives.
  • Recruit, mentor, and develop high-performing engineering leaders.

Benefits

  • Flexible medical
  • Life insurance
  • Retirement options
  • Volunteer programs