We’re looking for a Senior/Staff Site Reliability Engineer to build Mochi’s AI-driven APM and incident management system: one that doesn’t just alert and page, but learns. This is a foundational role at the intersection of SRE, platform engineering, and applied AI. You’ll design the feedback loops (human-in-the-loop / RLHF-style), guardrails, and automation that let our reliability posture improve over time. You’ll own the systems and workflows that turn incidents into intelligence: automated triage, root cause analysis, remediation, and, when issues are code-level, bug-fix proposals (PRs, test runs, staged rollouts). If you’re excited by the idea of building a self-improving SRE “copilot”, this job is for you.
Job Type
Full-time
Career Level
Mid Level
Education Level
No Education Listed
Number of Employees
11-50 employees