Founding Site Reliability Engineer

Assort Health
San Francisco, CA

About The Position

At Assort Health, we believe healthcare should feel effortless and connected: quick answers, clear communication, and seamless access to care. That’s why we’re building a new foundation for how patients and providers connect, driven by AI, built to embrace the complexities of healthcare, and tailored to each provider’s unique needs. Assort is the most comprehensive patient experience platform powered by specialty-specific agentic AI. Assort’s omnichannel AI agents integrate with EHR/PMS systems and complex provider preferences to eliminate the lengthy hold times and inefficiencies that stand in the way of patients getting the care they need.

Since launching in 2023, Assort has managed more than 45M patient interactions, cutting average hold times from 11 minutes to 1 minute. Our platform now handles calls for thousands of providers with 98%+ resolution rates and 99% scheduling accuracy. Patient satisfaction averages 4.4/5, and we’ve achieved 11× revenue growth since Q4 2024. We’re scaling rapidly and expanding adoption across the healthcare industry.

Requirements

  • 3+ years focused on reliability, SRE, or production infrastructure
  • Hands-on experience running production systems in startups or growth-stage companies (not just large enterprises)
  • Comfortable balancing firefighting with strategic reliability improvements
  • Cloud infrastructure experience (GCP preferred; AWS or Azure fine)
  • Implemented or maintained observability stacks (Datadog, Prometheus, Grafana, Honeycomb, OpenTelemetry, Sentry, PagerDuty)
  • Can code and automate
  • Comfortable with Kubernetes

Nice To Haves

  • Infrastructure-as-code (Terraform strongly preferred)
  • CI/CD pipelines and modern deployment strategies
  • Early-stage/high-growth experience
  • Exposure to security, compliance, or resilience architectures
  • Voice infra experience (Twilio, etc.)

Responsibilities

  • Define, own, and improve SLIs / SLOs / error budgets — set measurable targets around availability, latency, and error rates, and drive toward achieving them
  • Build and maintain observability across the stack (metrics, logging, tracing, dashboards, alerts, anomaly detection)
  • Lead incident management: coordinate responses, improve runbooks and postmortems, automate with AI tools, and collaborate with partners like Deepgram, Cartesia, GCP, and EHRs to ensure capacity
  • Reduce operational toil by automating repetitive tasks and building self-healing systems and remediation workflows
  • Improve deployment safety through canary or blue/green rollouts, automated rollbacks, chaos experiments, and deployment guardrails
  • Contribute to infrastructure work: IaC, cloud architecture, networking, autoscaling, and related systems
  • Ensure reliability across services, databases, caches, queues, third-party integrations, and networks
  • Drive capacity planning, performance tuning, and cost optimization
  • Mentor others on reliability best practices and champion a reliability mindset across engineering

Benefits

  • Competitive Compensation – Including salary and employee stock options so you share in our success.
  • Lifelong Learning – Annual budget for professional development, plus training opportunities to help you grow.
  • Office Setup Stipend – We’ll outfit your in-office workspace so it’s as comfy as it is productive.
  • Top-Tier Health Coverage – Medical, dental, and vision insurance, because your health comes first.
  • Unlimited PTO – We trust you to take the time you need to recharge and come back ready to crush it.
  • Meals & Snacks – Lunch, dinner, and snack breaks that fuel great ideas.
  • Wellness Stipend – Your physical and mental well-being matters, and we’ve got a yearly stipend to prove it.
  • 401(k) – Let us help you plan for the future. We’ve got you covered.