Member of Technical Staff - Enterprise Model Evaluation

xAI•Palo Alto, CA

2d•Onsite

About The Position

About xAI

xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All engineers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

About the Role

The Model Evaluations team designs and implements the evaluations that shape how xAI understands, measures, and improves its models’ capabilities. You will work at the intersection of research and product to develop model evaluations that give us high signal into emerging model capabilities, and to build robust evaluation infrastructure that enables fast iteration on our models. Your work will be essential to xAI’s mission of understanding the universe. You will collaborate closely with the training and product teams to ensure our models meet the highest standards before deployment. This is a technical leadership role in which you will be expected to drive both the vision and the implementation of our model evaluations.

Nice To Haves

Proven expertise in designing and implementing sophisticated evaluation frameworks for machine learning models, especially LLMs.
Experience with statistical analysis, experimental design, and benchmarking AI systems in real-world settings.

Responsibilities

Design and implement next-generation evaluation suites beyond traditional benchmarks, creating frameworks that capture real-world utility and performance of Grok in production environments.
Coordinate model evaluation efforts and collaborations to ensure comprehensive coverage and fast iterations.
Integrate Grok into production systems, gain deep insights into real-world environments, and ensure alignment with user needs and business objectives.
Partner with research teams to translate cutting-edge techniques and Grok models into production-ready implementations, optimizing for performance and impact.

Benefits

Base salary is just one part of our total rewards package at xAI, which also includes equity; comprehensive medical, vision, and dental coverage; access to a 401(k) retirement plan; short- and long-term disability insurance; life insurance; and various other discounts and perks.
