Join a resiliency-focused engineering team that is building practical, production-grade GenAI capabilities to make operations faster, safer, and more predictable. This Technical Lead III role sits at the intersection of Site Reliability Engineering (SRE) and platform engineering, with significant greenfield work to design, prove, and scale AI-driven approaches to incident response and operational workflows. You will lead hands-on delivery across the full lifecycle: selecting models and validating their fit (including evaluation and testing), building AI-enabled platform components on AWS (for example, using Amazon Bedrock and Model Context Protocol (MCP) patterns), and integrating those capabilities into real SRE tooling such as runbook automation, remediation workflows, and internal self-service experiences. The goal is simple: measurable reliability outcomes, delivered through well-engineered automation that teams can trust and adopt.
Job Type: Full-time
Career Level: Mid Level