Sr. Site Reliability Engineer

Illumio, Sunnyvale, CA
Onsite

About The Position

Illumio is a leader in ransomware and breach containment and in Zero Trust Segmentation, redefining how organizations contain cyberattacks and enable operational resilience in a world facing unprecedented cyber threats. Powered by the Illumio AI Security Graph, its breach containment platform identifies and contains threats across hybrid multi-cloud environments. The Engineering team is shaping the future of cybersecurity, fostering a culture of innovation, autonomy, and ownership. The role involves working on a highly scalable SaaS service that is built with cloud-native technologies and also shipped on-premises. The engineering philosophy emphasizes disciplined engineering, focus, and ownership at all levels.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or related field; or equivalent work experience
  • 6+ years of experience working as a Site Reliability Engineer (SRE) or in a similar role, with a focus on the Azure cloud platform
  • Hands-on experience in designing, deploying, and managing Azure infrastructure, including compute, storage, networking, and security services
  • Proficiency in scripting and programming languages such as PowerShell, Python, or Go for automation and infrastructure management tasks
  • Strong understanding of CI/CD principles and experience with tools such as Azure DevOps, Jenkins, or GitLab CI/CD
  • Excellent analytical, problem-solving, and communication skills, with the ability to collaborate effectively with cross-functional teams

Nice To Haves

  • Experience with containerization technologies (e.g., Docker, Kubernetes) and microservices architecture in Azure environments
  • Azure certifications such as Azure Solutions Architect, Azure DevOps Engineer, or Azure Security Engineer

Responsibilities

  • Design, deploy, and maintain highly available and scalable infrastructure solutions on the Azure cloud platform to support our applications and services
  • Implement infrastructure as code (IaC) principles using tools such as Terraform or ARM templates to automate provisioning and configuration management
  • Develop and maintain robust CI/CD pipelines for automated software delivery and deployment, ensuring efficiency and reliability throughout the release process
  • Monitor system performance, application health, and infrastructure metrics using Azure monitoring and logging services, and implement proactive measures to optimize performance and availability
  • Lead incident response and resolution efforts, conducting root cause analysis, implementing corrective actions, and documenting post-incident reviews
  • Collaborate closely with development teams to design and implement scalable and reliable architectures, providing guidance on best practices for cloud-native application development
  • Implement security best practices and controls in Azure environments to protect data, applications, and infrastructure, and ensure compliance with regulatory requirements
  • Drive continuous improvement initiatives to enhance reliability, scalability, and efficiency of Azure infrastructure and services, leveraging automation and emerging technologies
  • Provide mentorship and guidance to junior team members, fostering a culture of learning, collaboration, and innovation within the SRE team
  • Stay current with Azure cloud platform updates, trends, and best practices, and evaluate emerging technologies for potential adoption to drive innovation and efficiency