Software QA Lead – AI & GenAI Platforms

AECOM, Chicago, IL
$150,000 - $190,000 (Hybrid)

About The Position

Build the quality engine behind enterprise AI. We’re looking for a hands-on QA leader to drive the strategy and execution of quality assurance for enterprise-grade Generative AI assistant applications. You’ll own how AI systems are tested, validated, and scaled, spanning full-stack applications powered by LLMs and RAG workflows. This role is critical to ensuring reliable AI outputs, strong user experiences, and responsible, ethical AI behavior in production. This position offers a flexible hybrid work schedule, combining in-office presence with telecommute/virtual work, and is based in Chicago, IL.

Requirements

  • Bachelor's Degree plus at least 10 years of software development experience, or demonstrated equivalency of experience and/or education
  • Proven experience leading QA for complex, full-stack applications
  • Strong expertise in testing Generative AI systems (LLMs, RAG workflows)
  • Hands-on experience with test automation tools (e.g., Selenium, Appium)
  • Experience with API testing, distributed systems, and CI/CD pipelines
  • Familiarity with performance testing tools (e.g., JMeter, LoadRunner)
  • Strong analytical, problem-solving, and communication skills

Nice To Haves

  • Experience with AI evaluation techniques (prompt testing, output scoring, bias testing)
  • Background in conversational AI, NLP, or chatbot testing
  • Familiarity with enterprise data/retrieval systems and knowledge bases
  • Experience working in cloud environments (AWS, Azure, or GCP)

Responsibilities

  • Define and execute QA strategy for end-to-end AI applications (frontend, backend, and AI workflows)
  • Lead validation of LLM outputs for accuracy, coherence, and reliability across diverse scenarios
  • Test and optimize RAG pipelines to ensure high-quality retrieval and grounded responses
  • Design user-centric test scenarios (multi-turn conversations, edge cases, failure handling)
  • Conduct performance, load, and latency testing across AI and application layers
  • Build and scale automated testing frameworks (API, UI, integration) within CI/CD pipelines
  • Establish and track quality metrics (accuracy, relevance, latency, user satisfaction)
  • Ensure ethical AI standards, including bias detection and responsible output validation
  • Partner cross-functionally to embed QA practices across the development lifecycle

Benefits

  • medical
  • dental
  • vision
  • life
  • AD&D
  • disability benefits
  • paid time off
  • leaves of absence
  • voluntary benefits
  • perks
  • flexible work options
  • well-being resources
  • employee assistance program
  • business travel insurance
  • service recognition awards
  • retirement savings plan
  • employee stock purchase plan