Principal Site Reliability Engineer (Prisma Access)

Palo Alto Networks • Santa Clara, CA

About The Position

As a Principal SRE at Palo Alto Networks, you will be at the forefront of building and maintaining highly reliable, scalable, and secure cloud infrastructure within a FedRAMP-compliant environment. This role requires US citizenship. You will drive operational excellence, champion SRE best practices, and work collaboratively to ensure our systems are robust and performant. The work spans automation, architecture, performance, observability, troubleshooting, security, and reliability. The Infrastructure Platform stack includes Terraform, Kubernetes, GitLab CI/CD, GitOps, Prometheus, Grafana, Loki, Docker, GCP, Backstage, MySQL, PagerDuty, FireHydrant, Python, Bash, Java, NodeJS, and Go.

Requirements

Must be a US Citizen to be considered.
7+ years of experience in Infrastructure, SRE, or DevOps roles.
BS or MS in Computer Science, a related field, or equivalent professional experience.
4+ years of experience with AWS and GCP, and expertise in their architecture, services and PKI concepts for cloud security.
Expert troubleshooting skills for cloud infrastructure and service issues, including identifying root causes and devising effective solutions.
Proficiency in automation using Python and shell scripting; Golang is a plus.
Expertise in Infrastructure as Code (IaC) with Terraform and Helm, leveraging AI tools for development.
Solid experience with Kubernetes, container networking, and container workloads.
Strong Linux administration skills.
Proficiency with CI/CD pipelines, GitOps principles, and tooling like GitLab and Jenkins.
Excellent written and verbal communication skills, with the ability to collaborate effectively to drive outcomes.
Self-disciplined, self-managed, and highly driven with a strong sense of ownership and urgency.
Ability to adapt quickly to evolving cloud technologies, security threats, and advancements through continuous learning.
Ability to address customer needs and provide clear Root Cause Analysis (RCA) to customers.
A deep understanding of how technical decisions impact the business and an ability to align cloud operations with business goals.

Responsibilities

Design, build, and operate reliable, secure cloud infrastructure across multi-cloud environments for our federal customers.
Lead cross-functional initiatives to ensure applications are production-ready, scalable, secure, and resilient.
Develop expertise in new technologies, embracing continuous learning and the adoption of AI tools.
Develop tools and automation frameworks, championing Infrastructure as Code (IaC) and Monitoring as Code (MaC) principles.
Automate robust deployments and orchestrate end-to-end monitoring and alerting solutions.
Participate in on-call rotations to support critical business and production systems.
Lead root cause analysis of critical issues, driving improvements and preventing recurrence.
Champion the success of SRE and DevOps initiatives, aligning technical decisions with business goals.

Benefits

Compensation offered for this position will depend on qualifications, experience, and work location.
Starting base salary is expected to be between $147,000 and $235,000 per year.
Offered compensation may also include restricted stock units and a bonus.

What This Job Offers

Career Level: Senior
Industry: Professional, Scientific, and Technical Services
Education Level: Bachelor's degree
Number of Employees: 5,001-10,000 employees
