Process - AI Training Manager

AT&T•Richardson, TX

$87,200 - $130,800•Onsite

About The Position

This position requires office presence at least 5 days per week and is available only in the location(s) posted. No relocation is offered.

At AT&T, we empower leaders to drive change in a fast-evolving, connected world. Your strategic vision will help serve customers and transform lives through innovative solutions and impactful connections.

The Sr Specialist Quality/M&P/Process - AI Training Manager is responsible for overseeing the training, validation, and continuous improvement of Agentic Capabilities (AI-powered agents designed to autonomously process a variety of ticket types within workflow management systems). This role owns AI agent quality: ensuring reliable, high-quality outcomes; rapidly reviewing and resolving exception (“fallout”) tickets; applying corrections and re-ingesting updates; improving training data and fine-tuning artifacts; updating the agent knowledge base; and rerunning tickets to validate fixes, continuously strengthening agentic capabilities.

Requirements

Understanding of the business function and its M&Ps, with the ability to align AI behavior with policy and process; knowledge of supported workflows and tools, including experience operating the systems involved (e.g., ticketing/workflow platforms).
Strong analytical and problem-solving skills with meticulous attention to detail; proven root cause analysis (RCA) capability.
General AI literacy and understanding of agentic systems; basic prompt engineering (iteration, testing, versioning).
Ability to manage fallout within SLAs, triage tickets, and drive rapid resolution; strong prioritization in fast-paced environments.
Observability: proficiency with logs, metrics, dashboards, and alerts; define and track quality KPIs (accuracy, fallout rate, MTTR).
Basic scripting understanding to automate corrections, content re-ingestion, and validation workflows.
Knowledge base authoring and maintenance; clear documentation of training methods, resolutions, and changes for auditability.
Compliance/data privacy/ethical guidelines awareness; maintain auditable processes and change logs.
Effective communication: synthesize findings, report metrics, and present recommendations to stakeholders.
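
As a rough illustration of the quality KPIs named above (accuracy, fallout rate, MTTR), the sketch below computes them from a few ticket records. The posting does not define these metrics or any ticket schema, so the `Ticket` fields and the formulas used here are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Ticket:
    # Hypothetical record shape; the posting does not specify a schema.
    opened: datetime
    resolved: Optional[datetime]  # None while the ticket is still open
    fell_out: bool                # True if the AI agent could not resolve it
    correct: bool                 # True if the agent's outcome passed review


def quality_kpis(tickets: list[Ticket]) -> dict[str, float]:
    """Compute accuracy, fallout rate, and MTTR (hours) under assumed definitions."""
    if not tickets:
        return {"accuracy": 0.0, "fallout_rate": 0.0, "mttr_hours": 0.0}

    # Assumption: accuracy is measured only over tickets the agent handled itself.
    handled = [t for t in tickets if not t.fell_out]
    accuracy = sum(t.correct for t in handled) / len(handled) if handled else 0.0

    # Assumption: fallout rate is the share of all tickets that fell out.
    fallout_rate = sum(t.fell_out for t in tickets) / len(tickets)

    # Assumption: MTTR is the mean open-to-resolved time over resolved tickets.
    resolved = [t for t in tickets if t.resolved is not None]
    if resolved:
        total_seconds = sum((t.resolved - t.opened).total_seconds() for t in resolved)
        mttr_hours = total_seconds / len(resolved) / 3600
    else:
        mttr_hours = 0.0

    return {"accuracy": accuracy, "fallout_rate": fallout_rate, "mttr_hours": mttr_hours}
```

In practice these figures would more likely come from a dashboard or BI tool than from a script; the point is only that each KPI needs an agreed definition before it can be tracked.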

Nice To Haves

Advanced observability (distributed tracing, SLO/SLA design) and incident response practices.
Experiment tracking and ML operations tooling, feature flags, canary/rollback strategies.
Familiarity with fine-tuning pipelines, retrieval/RAG, vector databases, and content ingestion pipelines.
SQL/BI tools for advanced analytics and dashboarding; ability to build executive-ready reports.
Version control (Git) for prompts, KB content, and evaluation artifacts; change management discipline.
Workflow orchestration for scheduled re-ingestion, evaluations, and reporting.
Experience in training, quality assurance, documentation, or knowledge management, including taxonomy/ontology design.
Advanced scripting/automation and experience writing/maintaining Markdown-based runbooks and KB articles.
Prior experience with AI in production settings and A/B testing platforms.

Responsibilities

Monitor the performance of Agentic Capabilities as they autonomously process various ticket types.
Ensure seamless integration of AI agents into new or existing workflows, optimizing for efficiency and accuracy.
Review fallout tickets (cases where AI agents cannot resolve issues) within a workflow management tool.
Diagnose root causes, make necessary corrections, and re-ingest updated information to the AI system.
Ensure all fallout tickets are actioned within a 48-hour window; unresolved tickets revert to the human-worked queue.
Analyze fallout patterns to identify knowledge gaps, process inefficiencies, or opportunities for AI improvement.
Develop and implement training protocols to enhance Agentic Capabilities, leveraging prompt engineering, model validation, and knowledge base updates.
Collaborate with cross-functional teams (product, engineering, support) to align AI behaviors with business needs and compliance requirements.
Maintain and update the agent knowledge base, ensuring accurate, current, and comprehensive content for AI agents.
Document training methodologies, ticket resolutions, and process improvements for knowledge sharing and auditing.
Validate AI performance through systematic review, testing, and user/stakeholder feedback.
Ensure all processes comply with regulatory standards, ethical guidelines, and company policies.
Track and report on key metrics: ticket resolution rates, fallout frequency, review turnaround times, and AI improvement outcomes.
Communicate insights, best practices, and recommendations to stakeholders and leadership.
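
The 48-hour fallout rule described above can be sketched as a small routing check. This is a minimal illustration only: the queue names, the `actioned` flag, and the function itself are hypothetical and not taken from any AT&T system.

```python
from datetime import datetime, timedelta

# From the posting: fallout tickets must be actioned within 48 hours.
FALLOUT_WINDOW = timedelta(hours=48)


def route_fallout_ticket(fell_out_at: datetime, actioned: bool, now: datetime) -> str:
    """Return the queue a fallout ticket belongs in (queue names are hypothetical)."""
    if actioned:
        # Corrected and re-ingested; ready to be rerun by the agent.
        return "ai_rerun"
    if now - fell_out_at >= FALLOUT_WINDOW:
        # Window expired without action: revert to the human-worked queue.
        return "human_worked"
    # Still inside the 48-hour window and awaiting review.
    return "fallout_review"
```

For example, a ticket that fell out 49 hours ago and has not been actioned would return `"human_worked"`, while one that fell out an hour ago would stay in `"fallout_review"`.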

Benefits

Medical/Dental/Vision coverage
401(k) plan
Tuition reimbursement program
Paid Time Off and Holidays (based on date of hire, at least 23 days of vacation each year and 9 company-designated holidays)
Paid Parental Leave
Paid Caregiver Leave
Additional sick leave beyond what state and local law require may be available but is unprotected
Adoption Reimbursement
Disability Benefits (short term and long term)
Life and Accidental Death Insurance
Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
Employee Assistance Programs (EAP)
Extensive employee wellness programs
Employee discounts of up to 50% on eligible AT&T mobility plans and accessories, AT&T internet (and fiber where available), and AT&T phone