Customer Service & Support AI Rater & Evaluator

LILT (Production)•Berlin, MA

Remote

About The Position

LILT is building a global network of domain experts to support high-quality AI evaluation across training, benchmarking, red-teaming, and ongoing model monitoring. We are seeking customer service and support professionals to contribute expert judgment to human-in-the-loop AI evaluation workflows used by leading enterprises and hyperscalers.

This role is designed for professionals who understand how customer support interactions work in real operational environments and who can apply that expertise to evaluate and improve multilingual AI systems used in customer-facing contexts. Your expertise will directly influence multilingual AI model quality, safety, and deployment readiness.

This role includes two distinct expert tracks, based on experience level and scope of responsibility.

Track A: Customer Service & Support AI Rater
Raters execute structured evaluation tasks using clearly defined rubrics and instructions.

Track B: Customer Service & Support AI Evaluator (Senior Track)
Evaluators provide higher-level domain oversight and help shape how evaluation is performed.

AI is changing how the world communicates, and LILT is leading that transformation. LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join our global community of people who thrive on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to enterprises, governments, and AI developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community, all through a streamlined application process tailored to your expertise.

Requirements

Track A: Customer Service & Support AI Rater
Customer support professionals, service operations specialists, or CX practitioners
Experience handling customer inquiries, support workflows, or service escalation
Strong attention to detail and comfort working with structured evaluation criteria

Track B: Customer Service & Support AI Evaluator (Senior Track)
Senior support leaders, CX managers, or service quality specialists
Experience defining support standards, reviewing complex edge cases, or managing escalations
Ability to clearly explain nuanced service decisions and tradeoffs

Both tracks
Deep domain expertise in customer service, support operations, or CX
Strong judgment and ability to apply criteria consistently
Comfort working with structured evaluation workflows
Ability to explain reasoning clearly, especially in sensitive customer scenarios
Reliability, professionalism, and respect for quality standards

Languages
Native or professional fluency in one or more supported languages is required
More than 30 languages are supported
English fluency is required for guidelines, feedback, and collaboration

Responsibilities

Track A: Customer Service & Support AI Rater
Evaluate AI outputs related to customer service and support interactions
Perform structured scoring, comparison, classification, and judgment tasks
Assess accuracy, clarity, tone, helpfulness, and alignment with support best practices
Identify hallucinations, misleading responses, policy violations, or unsafe guidance
Apply domain-specific customer support guidelines consistently across tasks

Track B: Customer Service & Support AI Evaluator (Senior Track)
Validate and refine evaluation rubrics and edge-case handling
Perform adjudication where raters disagree
Conduct error analysis and qualitative reviews of model behavior
Partner with LILT research, product, and customer teams on evaluation design
Support red-teaming, policy alignment, and model readiness assessments