About The Position

At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation. Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive.

Senior Site Reliability Engineer (Platform Focus)

About the Role

We’re looking for a Senior Site Reliability Engineer with a strong platform engineering mindset to design, build, and operate the foundational systems that power our products. This role blends deep infrastructure expertise with software engineering discipline to create scalable, resilient, and developer‑friendly platforms. You’ll partner closely with engineering teams to evolve our platform architecture, improve reliability, and accelerate delivery through automation, observability, and thoughtful system design.

Requirements

  • 8+ years in SRE, DevOps, or platform engineering with hands‑on ownership of production systems.
  • Expertise in Kubernetes and container orchestration at scale.
  • Proficiency with IaC tools such as Terraform, Ansible, and CloudFormation.
  • Solid programming skills in languages such as Go, Python, or Bash.
  • Deep understanding of distributed systems, networking, and Linux internals.
  • Experience building CI/CD pipelines using tools like GitLab runners, GitHub Actions, or Jenkins.
  • Strong observability background with Prometheus, Grafana, OpenTelemetry, or similar.
  • Proven track record of incident management and improving system reliability.

Nice To Haves

  • Experience designing internal developer platforms or platform‑as‑a‑product models.
  • Experience with OpenShift, Titan‑k8s, and Robin.
  • Knowledge of event‑driven architectures and streaming systems like Kafka.
  • Security engineering familiarity including secrets management and compliance frameworks.
  • Contributions to open‑source projects in the cloud‑native ecosystem.
  • Strong experience with cloud platforms such as AWS, Azure, or GCP.

Responsibilities

  • Build and evolve multiple distributions of the Kubernetes platform.
  • Build automation and tooling to streamline deployments, configuration, and environment management.
  • Drive reliability practices such as SLOs, error budgets, incident response, and post‑incident reviews.
  • Develop golden paths for service onboarding, CI/CD, and platform usage across all K8s variants.
  • Implement observability systems including metrics, logging, tracing, and alerting.
  • Collaborate with product and engineering teams to ensure platform capabilities meet evolving needs.
  • Optimize performance and capacity across compute, storage, and networking layers.
  • Champion infrastructure-as-code and modern cloud‑native patterns.
  • Drive automation-first operations using IaC and GitOps.
  • Lead incident response, root cause analysis (RCA), and post-incident learning, and improve on-call health.
  • Partner with security teams to enforce platform guardrails, policy, and secure defaults.
  • Lead complex troubleshooting efforts across distributed systems and production environments.
  • Mentor engineers and contribute to a culture of operational excellence.

Benefits

  • You may also be offered incentive compensation, bonus, restricted stock units, and benefits.
  • More details about F5’s benefits can be found at the following link: https://www.f5.com/company/careers/benefits.
  • F5 reserves the right to change or terminate any benefit plan without notice.