We are looking for an AI Evaluation & Test Engineer to ensure that generative AI models and applications are safe, accurate, and trustworthy, and that they deliver an elegant user experience. This role validates AI models and agents for accuracy, safety, bias, and performance through structured testing, benchmarking, and continuous evaluation pipelines. The engineer will build and maintain AI evaluation pipelines; implement traces, spans, and session tracking for observability; define AI quality metrics and KPIs; automate evaluation and testing; define and implement release gates in the CI/CD pipeline; find creative ways to break products; and assist in root cause analysis and troubleshooting of bugs and field issues. The role also involves collaborating with cross-functional teammates from product, engineering, linguistics, and customer support to shape human-AI interaction paradigms and ensure the desired outcomes and user experiences.
Job Type
Full-time
Career Level
Senior
Education Level
Associate degree