Senior Site Reliability Engineer

Prosper, San Francisco, CA

About The Position

You will be a senior technical contributor on the SRE team, responsible for the reliability, scalability, and security of Prosper’s Cloud Platform portfolio. This is as much a platform engineering role as it is an SRE role: you will maintain the applications that run on our platform, drive alignment to platform standards, and keep services current with their framework and dependency versions. We are building an agentic, AI-first operations model in which AI agents handle investigations, deployments, audits, and optimizations, and you will be at the center of designing and governing that system. You will share ownership of application-layer reliability, CI/CD pipelines, and observability while building the skills, rules, and guardrails that allow AI agents to operate safely alongside human engineers.

Requirements

  • 7+ years in SRE, DevOps, or Platform Engineering
  • Deep expertise with a major cloud provider (GCP preferred) and Kubernetes
  • Strong infrastructure-as-code experience with multi-environment patterns
  • Production CI/CD pipeline design
  • Observability and APM platform experience
  • Strong written communication — your documentation will be consumed by humans and AI agents alike

Nice To Haves

  • Experience building or integrating AI agents into operational workflows
  • Hands-on experience with LLM-powered development tooling
  • Background in designing guardrails or policy engines for automated systems
  • Track record of building internal developer platforms or self-service infrastructure

Responsibilities

  • Design and author AI agent skills — structured playbooks that encode investigation, deployment, and optimization workflows
  • Own application-layer reliability within Kubernetes-based compute (managed by the Infrastructure Engineering team) across all environments
  • Maintain and upgrade platform applications — drive framework upgrades, dependency updates, and alignment to platform standards
  • Drive infrastructure-as-code with modular, multi-environment patterns
  • Participate in the on-call rotation and lead incident response
  • Build and maintain observability across cloud monitoring and APM platforms
  • Own the Internal Developer Platform — CI/CD pipelines, deployment tooling, and developer self-service
  • Mentor junior SREs and shape team standards

Benefits

  • Flexible time off
  • Comprehensive health coverage
  • Competitive salary
  • Paid parental leave
  • Wellness benefits, including access to mental health resources and virtual HIIT and yoga workouts
  • Udemy access
  • Childcare assistance
  • Pet insurance discounts
  • Legal assistance