As an Engineer in Site Reliability Engineering (SRE) for AI Systems, you will help ensure the reliability, scalability, and performance of AI platforms. The role includes participating in on-call rotations, improving system observability, and supporting operations across cloud-native infrastructure. It is a hands-on role, ideal for someone with foundational SRE skills and a growth mindset who wants to expand into GenAI and LLM infrastructure operations. We pride ourselves on encouraging a culture of innovation, advocating for agile methodologies, and promoting transparency in all that we do. Join us in embodying the spirit of the 'Un-carrier' and make a tangible impact! Our team is dynamic, and no two days are the same. We are diverse, inclusive, and passionate about growth and transformation. If you're up to the challenge, apply today!
Job Type: Full-time
Career Level: Mid Level
Number of Employees: 5,001-10,000