Staff Software Engineer, Observability

Gusto, Inc.•San Francisco, CA

63d•$200,000 - $270,000•Hybrid

About The Position

At Gusto, we're on a mission to grow the small business economy. We handle the hard stuff, like payroll, health insurance, 401(k)s, and HR, so owners can focus on their craft and customers. With teams in Denver, San Francisco, and New York, we’re proud to support more than 400,000 small businesses across the country, and we’re building a workplace that represents and celebrates the customers we serve. Learn more about our Total Rewards philosophy.

Staff Observability Engineer

Gusto’s Reliability Engineering team enables our product teams to build impactful products by building secure, resilient, and accessible systems, using tools like AWS, Terraform, Datadog, and Kubernetes.

About Gusto

Gusto is a modern, online people platform that helps small businesses take care of their teams. On top of full-service payroll, Gusto offers health insurance, 401(k)s, expert HR, and team management tools. Today, Gusto offices in Denver, San Francisco, and New York serve more than 200,000 businesses nationwide. Our mission is to create a world where work empowers a better life, and it starts right here at Gusto. That’s why we’re committed to building a collaborative and inclusive workplace, both physically and virtually.

Requirements

Strategic systems thinker who identifies high-impact opportunities and builds scalable solutions.
Experience operating large-scale distributed systems in production, especially logging platforms or time series databases.
Strong fundamentals in systems, networking, and cloud infrastructure such as Kubernetes and AWS.
Thrive in ambiguous environments and roll up your sleeves to solve unscoped problems end to end.
Product mindset or full-stack instincts, and excitement about building real tools engineers love to use.
Strong communicator who can align technical and non-technical stakeholders.
8+ years of relevant industry experience building and operating large-scale observability or monitoring infrastructure.
Experience implementing or operating observability platforms such as Datadog, Sentry, Splunk, or similar.
Strong software engineering proficiency in at least one of Ruby, Python, or TypeScript.

Nice To Haves

Experience building or contributing to observability ecosystems such as OpenTelemetry or Prometheus.
Cloud infrastructure experience (AWS preferred).
Infrastructure-as-Code, such as Terraform or Crossplane.
Container orchestration, such as Kubernetes.
Linux systems knowledge and shell proficiency.
Experience designing highly available and scalable systems.
Experience acting as an internal consultant or trusted advisor.

Responsibilities

Shape the engineering organization's standards for observability.
Own and evolve the observability platform, including distributed logging, metrics, and tracing infrastructure.
Build AI-native capabilities to automatically detect anomalies, diagnose failures, and accelerate root cause analysis.
Create powerful developer experiences through dashboards, notebooks, and interactive debugging tools.
Drive reliability automation with intelligent alerting, diagnostics, and incident response systems.
Partner across engineering teams to embed observability and reliability best practices.
Mentor engineers and influence reliability culture across the organization.
