We're looking for a hands-on Reliability Tech Lead (IC) to own the mission of making Cerebras Inference the most reliable AI service in the world. You will drive reliability strategy and execution across our inference stack, from client SDKs and public-cloud multi-region deployments to wafer-scale systems in specialized data centers. In this role, you will define SLOs and incident-response frameworks, design and implement reliability mechanisms at scale, and partner across hundreds of engineers to ensure our service meets world-class reliability standards. If you are passionate about building and operating massive-scale, low-latency, high-reliability distributed systems, we want to hear from you.
Stand Out From the Crowd
Upload your resume and get instant feedback on how well it matches this job.
Job Type
Full-time
Career Level
Principal
Industry
Computer and Electronic Product Manufacturing