Staff Site Reliability Engineer

Netskope • Posted 13 days ago

Full-time • Mid Level

Santa Clara, CA

1,001-5,000 employees

About Netskope

Today, there's more data and users outside the enterprise than inside, causing the network perimeter as we know it to dissolve. We realized a new perimeter was needed, one that is built in the cloud and follows and protects data wherever it goes, so we started Netskope to redefine Cloud, Network and Data Security.

Since 2012, we have built the market-leading cloud security company and an award-winning culture powered by hundreds of employees spread across offices in Santa Clara, St. Louis, Bangalore, London, Paris, Melbourne, Taipei, and Tokyo. Our core values are openness, honesty, and transparency, and we purposely developed our open desk layouts and large meeting spaces to support and promote partnerships, collaboration, and teamwork. From catered lunches and office celebrations to employee recognition events and social professional groups such as the Awesome Women of Netskope (AWON), we strive to keep work fun, supportive, and interactive.

Visit us at Netskope Careers. Please follow us on LinkedIn and Twitter @Netskope.

About the role

We are a team of software engineers focused on improving the reliability, availability, latency, performance, efficiency, monitoring, emergency response, and capacity planning of the engineering stacks. If you are passionate about solving complex problems and developing cloud services at scale, we would like to speak with you.

As an SRE, you will write software to solve operational problems and drive cutting-edge reliability and observability practices. You will also set up and maintain monitoring, logging, and alerting systems that oversee extensive training runs and client-facing APIs. You will ensure that training environments are available and efficiently managed across multiple clusters, and you will enhance our containerization and orchestration systems using tools such as Docker and Kubernetes.

Partner closely with service owners and engineers to develop reliable services driven by best practices
Develop software and tools to solve a variety of problems across service and infrastructure
Set up and manage monitoring, logging, and alerting systems for extensive training runs and client-facing APIs
Ensure training environments are consistently available and prepared across multiple clusters
Develop and manage containerization and orchestration systems utilizing tools such as Docker and Kubernetes
Improve reliability, quality, and time-to-market of our suite of software solutions
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement
Provide primary operational support and engineering for multiple large-scale distributed software applications

Software programming experience in any programming language
Good understanding of principles of distributed systems
Deep understanding of Kubernetes and Docker
Understanding of data technologies such as Kafka, Yugabyte, and Redis
Good understanding of AWS ecosystem
Basic understanding of networking
Exposure to infrastructure-as-code tools such as Terraform
Familiarity with monitoring tools such as Prometheus, Grafana, or similar
8+ years building core infrastructure
Works with a sense of ownership
Takes pride in building and operating scalable, reliable, secure systems
Is comfortable with ambiguity and change
Has a knack for troubleshooting complex systems and enjoys solving challenging problems
Is proactive in identifying problems, performance bottlenecks, and areas for improvement
Has experience working and collaborating with teams based across different geographies and time zones

Experience operating and monitoring services that communicate across AWS and private clouds
Experience operating Kubernetes at scale
