Site Reliability Engineer

Origami Risk LLC
Chicago, IL
$100,000 - $120,000
Hybrid

About The Position

The Site Reliability Engineer is a key force behind improving Origami’s time to resolution and advancing overall site reliability and scalability. This person participates in efforts to identify root causes during post-incident investigations, while also identifying preventative measures to minimize future disruptions. They also assist with identifying root causes of performance challenges in client implementations and implement methods for tracking key performance metrics across clients.

Starting base pay for this role is between $100,000 and $120,000. The actual base pay is dependent upon many factors, such as transferable skills, work experience, business needs, training, location, and market demands. The base pay range is subject to change and may be modified in the future.

This role will be eligible for a bonus as well as competitive medical, dental, and vision benefits, wellness reimbursement, life insurance, and a 401(k) with company match. We offer vacation and sick leave benefits (under a flexible time off policy in most states).

Requirements

  • Bachelor's degree in Computer Science or related field (or equivalent experience)
  • 5+ years of proven experience in a Site Reliability Engineering role
  • Strong knowledge of SRE best practices and incident management protocols
  • Deep experience using and/or configuring New Relic, Datadog, Sumo Logic, or similar observability tools
  • Proficiency in reading and writing code (e.g., JavaScript, .NET, SQL)
  • Familiarity with cloud platforms (e.g., AWS, Azure) and architectural patterns
  • Excellent problem-solving skills and a data-driven approach to incident analysis
  • Prior experience operating within a Public Cloud environment (AWS strongly preferred)
  • Experience troubleshooting C#/.NET-based web applications to identify bugs and performance challenges
  • Solid knowledge of SaaS operations
  • Ability to succeed when facing ambiguity and differing levels of operational maturity
  • Advanced written and verbal communication skills

Nice To Haves

  • Windows and SQL Server troubleshooting skills
  • Knowledge of Continuous Integration and Continuous Delivery (CI/CD) pipelines
  • Experience working in an Infrastructure as Code (IaC) environment

Responsibilities

  • Leads post-incident investigations for the Site Reliability team.
  • Conducts in-depth post-incident analyses to identify root causes and develops preventive strategies.
  • Drafts clear and insightful root cause analyses (RCAs) for customer delivery.
  • Cross-trains colleagues on how to best leverage observability tools during incident and performance investigations.
  • Provides visibility to all stakeholders throughout the entire Site Reliability process.
  • Collaborates with cross-functional teams to implement system enhancements that improve scalability and stability.
  • Develops client-focused dashboards/alerts to proactively identify performance challenges.
  • Monitors and continuously improves our time to resolution metrics.
  • Maintains and configures core observability tools to ensure optimum performance and that key metrics and data are available for incident response and performance investigations.
  • Provides an actionable feedback loop to Observability and Engineering teams toward improving MELT (metrics, events, logs, and traces) and development patterns.
  • Contributes to the development of automation tools to streamline incident response.
  • Works proactively to prevent incidents and reduce their impact on our platform.
  • Partners with the larger Cloud Operations, SRE, Engineering teams, and the business-at-large to advance our SaaS platforms.
  • Other duties as assigned.

Benefits

  • Medical and Dental coverage available for employees, dependents, domestic partners, and spouses
  • Paid Time Off – Flexible options plus 10 paid company holidays where available
  • All full-time positions are hybrid, with many eligible to be completely remote
  • Fully Paid by Origami Risk – Vision insurance, Short & Long-Term Disability Insurance, and Basic Life Insurance
  • Generous family leave options—including adoption and foster care placements
  • Pre-Tax Savings Accounts – Flexible Spending Account, Health Savings Account, Commuter Benefits, Dependent Care Savings Account
  • Retirement Savings – 401(k) with company match up to 4%
  • Employee Assistance Program (EAP) – Confidential & Free support offered to colleagues facing personal or work-related complications
  • Education Assistance Program – to help colleagues pursue industry/role-specific certifications
  • Wellness Benefits – reimbursement program to invest in healthy habits as well as support better colleague productivity and stress management
  • Additional coverages available – Pet Insurance, Critical Illness Insurance, and Voluntary Life & AD&D coverage