Senior Site Reliability Engineer

Glean Technologies
San Francisco Bay Area, CA
Posted 107d ago
$155,000 - $250,000

About The Position

We are seeking a skilled and motivated Senior Site Reliability Engineer (SRE) to join our dynamic and innovative team. As an SRE, you will play a critical role in ensuring the reliability, availability, and performance of our cloud-based services and applications. You will work closely with our engineering teams to design, build, and maintain robust, scalable, and highly available cloud infrastructure. Much of our software development focuses on building infrastructure to scale our operations in a hybrid cloud environment and on eliminating manual work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale and fast growth that are unique to Glean, while using your expertise in coding, algorithms, problem-solving, and SRE practices. We keep Glean applications up and running, ensuring our customers have the best and most reliable experience possible.

Requirements

  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience.
  • 8+ years of experience in a senior-level Site Reliability Engineering or similar role, particularly in managing cloud-based services and infrastructure.
  • 5+ years of experience with software development in one or more programming languages.
  • 2+ years of experience managing people or teams, leading projects, and designing, analyzing, and troubleshooting distributed systems running in the cloud.
  • Strong knowledge of cloud platforms such as Google Cloud Platform, AWS, or Azure.
  • Practical experience with containerization technologies, including Docker and Kubernetes.
  • Familiarity with infrastructure-as-code tools like Terraform is essential.
  • Solid understanding of networking, security principles, and SRE and security best practices.
  • Proficiency in using monitoring and alerting tools to detect and respond to potential issues effectively.

Responsibilities

  • Play a key role in driving technical excellence and fostering a culture of reliability across engineering teams.
  • Implement and maintain resilient cloud architectures, monitor system performance, and proactively identify and resolve potential bottlenecks or points of failure.
  • Participate in the primary on-call rotation; cultivate technical curiosity, a growth mindset, and a blameless postmortem culture within the team.
  • Develop and maintain automation scripts, tools, and processes to streamline system deployment, monitoring, and management tasks.
  • Optimize cloud infrastructure and applications for performance, scalability, and cost-effectiveness.
  • Collaborate with security engineers to implement best practices and ensure compliance with security standards and policies.
  • Design and configure advanced monitoring systems to gain insights into system behavior, set up alerts, and respond proactively to potential issues.
  • Engage actively in the entire software development lifecycle and provide valuable SRE insights during launch reviews.

Benefits

  • Competitive compensation
  • Medical, Vision, and Dental coverage
  • Generous time-off policy
  • 401(k) plan
  • Home office improvement stipend
  • Annual education and wellness stipends
  • Vibrant company culture through regular events
  • Healthy lunches daily

What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Industry: Publishing Industries
  • Education Level: Bachelor's degree
  • Number of Employees: 501-1,000 employees
