Principal Site Reliability Engineer

Palo Alto Networks•Office - USA - CA - Headquarters, CA

1d•$151,600 - $245,300•Onsite

About The Position

Your Career Palo Alto Networks runs a large hybrid infrastructure and is one of the largest GCP customers. As a Site Reliability Engineer, you will be part of a team supporting the services running on this infrastructure. This includes automation, architecture, performance, metrics, troubleshooting, security, and reliability. Our stack includes Kubernetes, Docker, GCP, AWS, Ansible, Terraform, Vault, Gitlab, Spinnaker, Pub/sub, Bigtable, Memorystore, Bigquery, RabbitMq, Kafka, MySQL, Python, and Go. We don’t expect you to know all these, but we do expect you to learn the ones needed for this role. The Team Wildfire is the industry's largest cloud-based malware protection engine that uses machine learning and crowdsourced intelligence to instantly prevent up to 95% of unknown malware variants inline without compromising business productivity. Wildfire infrastructure team supports the scalability and high availability of Wildfire clouds.

Requirements

BS or MS in Computer Science, a related field, or equivalent professional experience or equivalent military experience
Expertise in configuration management with a framework such as Ansible, Terraform, Helm, Kubernetes
Proficient in Python and/or Go
Expertise in managing applications in the Kubenetes cluster with autoscaling enabled
Experience in Production Engineering, DevOps, or Site Reliability
Expertise in the public cloud (GCP or AWS), especially in GCP
Strong Linux administration, internals, and network troubleshooting
Proficiency with programming languages like Python, Golang, and shell scripting to automate tasks
Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions
Excellent written and verbal communication, able to collaborate and rally support
Self-disciplined, self-managed, self-motivated, and strong sense of ownership, urgency, and drive
Passion for infrastructure and monitoring as code
Ready to understand and dissect new technology stacks quickly