Role summary: The (USA) Principal, Site Reliability Engineer leads the design, development, and implementation of reliability programs for complex site environments. This role ensures system performance, scalability, and disaster recovery through advanced monitoring, root cause analysis, and infrastructure automation. The position requires expertise in software architecture, distributed systems, and cloud technologies to optimize operational efficiency and resilience. The Principal Engineer collaborates across teams to drive continuous improvement, establish reliability standards, and support business objectives by delivering robust, scalable, and secure solutions aligned with organizational goals.
About the team: The CES team delivers exceptional customer service experiences to millions of Walmart customers and agents worldwide. Comprising software engineers, data scientists, and machine learning experts, the team advances GenAI technology within complex enterprise applications. As part of Walmart Global Tech’s Enterprise Business Systems, CES collaborates closely with product, business, and UX teams to drive measurable business outcomes. The team focuses on innovation, reliability, and scalability to support Walmart’s mission of helping customers save money and live better through cutting-edge technology and robust site reliability engineering practices.
Job Type
Full-time
Career Level
Principal
Number of Employees
1-10 employees