Site Reliability Engineer (SRE)

Sonio, Boston, MA
$165,000 - $190,000, Hybrid

About The Position

Sonio is seeking its first Site Reliability Engineer (SRE) and first engineer in the US to own the platform’s stability and releases, particularly during PST hours. This role requires a hybrid profile, combining system administration and software engineering skills. The SRE will operate with high autonomy, making critical decisions during incidents and ensuring the production environment is state-of-the-art, secure, and resilient.

The position reports to the Lead DevOps Engineer and involves bridging infrastructure and code by working with Kubernetes, Terraform, and AWS, with the ability to read and patch Elixir code. Key responsibilities include driving incident response end-to-end, improving platform operability through SLO definition and alert tuning, enhancing observability, transferring operational knowledge from France to the US via runbooks and documentation, and supporting compliance and security in a regulated medical-device environment with HIPAA-aligned controls.

Requirements

  • 4+ years of experience in SRE, DevOps, or Production Engineering, including significant on-call experience on a 24/7 product.
  • Hybrid "code-literate" mindset, acting as an infrastructure expert who can also navigate a backend codebase to triage and patch issues independently.
  • Strong technical foundations in Kubernetes, Terraform, and AWS, along with the ability to architect and tune your own observability signals.
  • Highly autonomous and comfortable making technical decisions with limited supervision.
  • Operational rigor and ability to stay calm under pressure.
  • Written English skills necessary to produce high-quality runbooks and handle async handoffs.
  • Interest in Sonio's mission.

Responsibilities

  • Own US coverage for releases and incidents as the first responder during PST hours.
  • Bridge infra and code by working hand-in-hand with our DevOps team on Kubernetes, Terraform, and AWS, while being able to read and patch Elixir code to unblock yourself without waiting for a backend engineer.
  • Drive incident response end-to-end, managing triage, mitigation, and blameless post-mortems with real follow-through.
  • Improve the platform’s operability by defining SLOs, tuning alerts to reduce toil, and pushing observability (metrics, logs, tracing) where it’s lacking.
  • Transfer operational knowledge from France to the US by authoring runbooks and documenting procedures so local teams are empowered to act when something breaks.
  • Support compliance and security in our regulated medical-device environment, maintaining HIPAA-aligned controls and an audit-ready infrastructure.

Benefits

  • Health Insurance (medical, vision, dental) - up to $30,000 per year + FSA & HSA
  • 401(k) - up to 4% of your salary matched
  • Life Insurance - covering 2 times your salary, up to $200k
  • An attractive Parental Policy for primary and secondary caregivers
  • 20 PTO days + 1 week off between Christmas and New Year
  • Offices in Boston (HQ) & New York (incl. free breakfast, drinks & gym)
  • Flexible hours & remote policies
  • Commuter Benefits
  • One offsite per year in France & regular team building with US team
  • Ongoing training and continuous opportunities for professional growth and development, including unlimited access to coaching