Sr Manager, Site Reliability (SASE)

Palo Alto Networks • Office - USA - CA - Headquarters, CA

2d • Onsite

About The Position

Our Mission

At Palo Alto Networks®, we’re united by a shared mission: to protect our digital way of life. We thrive at the intersection of innovation and impact, solving real-world problems with cutting-edge technology and bold thinking. Here, everyone has a voice, and every idea counts. If you’re ready to do the most meaningful work of your career alongside people who are just as passionate as you are, you’re in the right place.

Who We Are

In order to be the cybersecurity partner of choice, we must trailblaze the path and shape the future of our industry. This is something our employees work at each day and is defined by our values: Disruption, Collaboration, Execution, Integrity, and Inclusion. We weave AI into the fabric of everything we do and use it to augment the impact every individual can have. If you are passionate about solving real-world problems and ideating beside the best and the brightest, we invite you to join us!

We believe collaboration thrives in person. That’s why most of our teams work from the office full time, with flexibility when it’s needed. This model supports real-time problem-solving, stronger relationships, and the kind of precision that drives great outcomes.

Job Summary

Your Career

We are looking for a visionary Senior Manager of Site Reliability Engineering to lead our global SRE organization across the US and India. This isn't just a "keep the lights on" role; you will be the primary architect of our AI-driven Autonomous SRE transformation at Palo Alto Networks. You will bridge the gap between infrastructure products and operational excellence, gathering complex requirements from product teams and translating them into automated, intelligent self-service platform capabilities to ensure our systems are not just reliable, but self-healing.

Requirements

10+ years in SRE, Infrastructure or DevOps environments.
5+ years managing global teams of 15+ engineers across multiple time zones.
Deep understanding of Cloud Native ecosystems (Azure/AWS/GCP), Kubernetes and CI/CD pipelines.
Proven track record of implementing ML-driven monitoring (e.g., anomaly detection, automated root cause analysis, event correlation).
Exceptional ability to translate "deep tech" into business value for C-suite stakeholders.
Experience using AI tools like Claude, Gemini or Copilot to build solutions is mandatory.

Responsibilities

Directly manage and scale a high-performing, multi-geographical SRE team (US and India), fostering a culture of psychological safety, continuous learning, and "operational pride."
Standardize SRE practices globally while respecting local nuances, ensuring 24/7 coverage models (Follow-the-Sun) are seamless and burnout-resistant.
Manage the financial aspects of global headcount and cloud infrastructure spend.
Drive the Autonomous SRE Roadmap: Transition the organization from reactive monitoring to proactive, AI-driven observability and incident remediation, using machine learning to reduce Mean Time to Recovery (MTTR).
Act as the lead consultant for infrastructure product teams to define what "reliability" looks like for next-gen AI services.
Partner with the Platform Engineering team to build and internalize "Golden Paths" that bake in SLOs, error budgets, and automated canary analysis.
Work hand-in-hand with InfoSec and Compliance to automate guardrails (Policy-as-Code) and ensure global data sovereignty requirements are met.
Influence R&D leadership to prioritize non-functional requirements and technical debt reduction.