Overview
Walmart’s Next Gen Commerce team is building the future of conversational shopping with intelligent agents that reason, recommend, and proactively assist customers. As a Principal Data Scientist for Quality & LLM Judging Systems, you will serve as the technical lead for defining and measuring the success of these AI systems. You will be responsible for designing the "brain" that critiques our agents, utilizing a mix of LLM-as-a-judge frameworks, human benchmarks, and automated pipelines. In this high-impact, hands-on role, you will partner closely with engineering and product leaders to translate subjective quality goals into rigorous, actionable metrics that drive model improvement and safe deployment.
Job Type
Full-time
Career Level
Principal
Number of Employees
5,001-10,000 employees