Senior Manager, SRE & Networking

F5, San Jose, CA
Hybrid

About The Position

At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation. Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive.

About the Role

We are seeking a highly experienced Senior Manager to lead our Platform SRE, Virtualization, Networking, and AI Infrastructure organizations. This leader will oversee teams operating mission-critical infrastructure across:

  • Kubernetes platforms: OpenShift, Titan‑k8s, Robin, Vanilla Kubernetes
  • Virtualization & hypervisors: Proxmox, VMware, XCP‑ng, KVM
  • Private cloud platforms: OpenStack
  • Networking: Data center & cloud networking, L4/L7 services, Kubernetes CNI/service mesh
  • AI/GPU compute: BNK‑AI‑LAB & TMOS AI Lab clusters

This role is responsible for multi-team leadership, strategic platform roadmap, operational excellence, and end‑to‑end reliability across hybrid compute environments (VMs, containers, and AI workloads). You will partner closely with Engineering, Cloud, Security, and Architecture leaders to deliver reliable, scalable, and developer-friendly platforms.

Requirements

  • 10+ years of experience in SRE, Platform Engineering, Virtualization, Networking, or Infrastructure
  • 5+ years managing engineering teams (including managers or tech leads)
  • Deep experience with Kubernetes, virtualization, and cloud/networking
  • Strong leadership, communication, and cross-functional alignment
  • Proven track record of improving platform uptime, performance, and reliability
  • Proven leadership in:
      • Kubernetes platforms (OpenShift, Titan‑k8s, Robin, Vanilla K8s)
      • Virtualization (Proxmox, VMware, XCP‑ng, KVM)
      • OpenStack (Nova, Neutron, Cinder, Keystone)
      • Data center/cloud networking and distributed systems
  • Strong executive communication skills and cross-org influencing ability.
  • Demonstrated experience improving operational maturity and reliability for large-scale systems.
  • Strong background in automation, CI/CD, observability, and infrastructure architecture.

Nice To Haves

  • Experience running large-scale multi-cluster Kubernetes environments.
  • Experience with service mesh, ingress controllers, and network policy frameworks.
  • Familiarity with GPU scheduling, Ray, Kubeflow, MLflow, or Triton Inference Server.
  • Experience with storage backends (Ceph, vSAN, ZFS, CSI-based solutions).
  • Experience driving multi-year infrastructure transformation programs.
  • Expertise in GitOps and IaC (Terraform, Ansible, Pulumi).

Responsibilities

  • Multi-team ownership: SRE, Networking, Virtualization, AI/GPU Infrastructure
  • Lead hybrid data centers — spanning routing, switching, firewalls, SDN/overlay, Kubernetes CNI, and service‑mesh/L4‑L7 traffic — to drive network reliability, performance, security, and automation.
  • Reliability strategy: SLO/SLI programs, incident management, automation, scaling
  • Kubernetes platform operations across multiple distros
  • Virtualization & private cloud: Proxmox, VMware, XCP-ng, KVM, OpenStack
  • Provide executive oversight for OpenStack compute, storage, and networking services. Ensure scalable VM lifecycle management, resource optimization, and operational maturity.
  • Networking: datacenter/cloud networking, CNI, service mesh, L4/L7 traffic
  • Own end‑to‑end reliability and performance of AI compute platforms, including model training/inference pipelines, GPU scheduling and autoscaling, and high‑performance compute environments
  • Partner with ML, Data, and Product to build next-gen AI compute platforms. Drive adoption of automation-first operations, GitOps, and infrastructure-as-code.
  • Own the multi‑year platform roadmap across hybrid compute, Kubernetes, virtualization, AI, and networking while driving cross‑org alignment and leading large‑scale modernization across CI/CD, observability, and infrastructure.
  • Drive organizational strategy, prioritization, staffing plans, hiring, and budgeting.
  • Build a high-performance, inclusive culture focused on ownership, excellence, and continuous improvement.

Benefits

  • You may also be offered incentive compensation, bonus, restricted stock units, and benefits.
  • More details about F5’s benefits can be found at the following link: https://www.f5.com/company/careers/benefits.
  • F5 reserves the right to change or terminate any benefit plan without notice.