Okta • Posted 3 months ago
$168,000 - $227,000/Yr
Full-time • Senior
Bellevue, WA
Web Search Portals, Libraries, Archives, and Other Information Services

Our company is seeking a highly skilled Staff Site Reliability Engineer to join our team. We are a SaaS company specializing in securing large-scale systems. This role is a blend of software engineering and systems administration, where you'll be responsible for building and maintaining highly reliable, scalable, and secure infrastructure. You will be a key contributor, applying your expertise to automate manual processes and proactively solve complex problems before they become incidents. The role also involves handling incidents and includes on-call shifts.

  • Design, build, and maintain the core infrastructure that underpins our security SaaS offerings, ensuring high availability, performance, and scalability.
  • Develop robust automation using code to eliminate toil and ensure consistency across our environments.
  • Work closely with our security teams to embed a security-first mindset into all our processes and infrastructure.
  • Participate in on-call rotations and be a primary responder for critical incidents, leading root cause analysis and implementing preventative measures.
  • Partner with development, data science, and security teams to provide expert guidance on architectural decisions, best practices, and the implementation of new services.
  • Strong coding skills and comfort writing production-level code.
  • Deep experience with Terraform for provisioning and managing cloud infrastructure and services.
  • Familiarity with modern CI/CD practices and tools, particularly Spinnaker.
  • Expertise in container technologies and hands-on experience managing large-scale, production-ready clusters with Kubernetes.
  • Experience with database schema management tools like Flyway.
  • Direct experience with large-scale data systems, specifically with the Snowflake platform.
  • Excellent analytical and problem-solving skills.
  • Experience or a strong interest in AI/ML, particularly how these technologies can be applied to improve reliability, security, and operational efficiency.
  • Health, dental and vision insurance
  • 401(k)
  • Flexible spending account
  • Paid leave including PTO and parental leave