Lead Machine Learning Engineer

The Walt Disney Company•Orlando, FL

23d•Onsite

About The Position

At Disney Experiences Technology, our team creates world-class immersive and digital experiences for the Company’s vacation brands: Disney’s Parks and Resorts worldwide, Disney Cruise Line, Aulani, A Disney Resort & Spa, and Disney Vacation Club. The Disney Experiences Technology team is responsible for the end-to-end digital and physical Guest experience for all technology and digital-led initiatives across the Attractions & Entertainment, Food & Beverage, Resorts & Transportation, and Merchandise lines of business, as well as other initiatives including the MyDisneyExperience app and Hey, Disney!

The team is seeking a results-oriented, hands-on Lead Machine Learning Engineer to design, develop, and deploy high-impact AI/ML solutions that drive measurable business value across our entertainment company. In this role, you will lead complex, cross-functional projects with a strong emphasis on reuse, scalability, reliability, and performance. The Lead ML Engineer will report to the ML Engineering Manager.

About The Role & Team

The DXT AI Technology Platform team is responsible for building an AI enablement platform for the DX segment, providing streamlined AI and Generative AI capabilities on which the segment can build solutions. The Lead Machine Learning Engineer will design, develop, and implement robust, enterprise-grade AI/ML solutions, including agentic systems, multi-modal models, RAG, and Responsible AI applications. This position is in office.

Requirements

7+ years of proven expertise in designing, building, and deploying AI/ML solutions at scale, with 1-2 years of production experience in Generative AI technologies.
Strong foundation in machine learning, including statistical modeling and supervised and unsupervised learning algorithms.
Advanced skills in prompt engineering with deep understanding of optimization techniques and best practices for LLM interactions.
Expert-level programming proficiency in Python and AI/ML development ecosystems.
Deep expertise in modern AI frameworks including LLM application development and agentic systems (LangChain, CrewAI, or similar).
Comprehensive MLOps experience with hands-on implementation of CI/CD pipelines, model monitoring, versioning, and lifecycle management for both models and agent-based systems.
Production deployment experience on major cloud platforms (AWS, Azure, or GCP) with demonstrated ability to architect and scale cloud-native ML solutions.
Versatile ML skillset spanning traditional techniques (classification, regression, clustering) and cutting-edge deep learning approaches.
Production-grade generative AI experience deploying and maintaining LLMs and multi-modal models in live environments.
Exceptional analytical capabilities with a track record of solving complex technical problems and thriving in ambiguous, rapidly evolving situations.
Proficiency with industry-standard ML libraries including PyTorch, TensorFlow, Scikit-learn, NumPy, and Pandas.
Outstanding communication and collaboration skills with ability to translate complex technical concepts for diverse audiences and drive cross-functional alignment.
Demonstrated success partnering across organizational levels, from individual contributors to senior leadership, building trust and delivering results.
Proven ability to influence and lead in matrix organizations where collaboration and relationship-building are essential to achieving outcomes.
Bachelor's degree in Computer Science, Machine Learning, Mathematical Sciences, Information Systems, Software, Electrical or Electronics Engineering, or comparable field of study, and/or equivalent work experience.

Nice To Haves

Experience with vector databases and embedding technologies.
Specialized expertise in AI safety and responsible AI using evaluation tools such as Arize, Langfuse, TruLens, or equivalent platforms for hallucination detection, bias mitigation, and model performance assessment.
Experience with advanced ML techniques including reinforcement learning from human feedback (RLHF), model fine-tuning (LoRA, QLoRA), retrieval-augmented generation (RAG), or model distillation and optimization.
Familiarity with real-time data processing and streaming architectures using technologies such as Apache Kafka, Google Pub/Sub, AWS Kinesis, or Azure Event Hubs for building responsive ML systems.
Master's degree or Ph.D. in Artificial Intelligence, Machine Learning, Mathematical Sciences, Computer Science, Information Systems, Software, Electrical or Electronics Engineering, or comparable field of study, and/or equivalent work experience.

Responsibilities

Develop sophisticated, production-scale AI systems, including multi-step agentic workflows and multi-agent orchestration platforms.
Build tools and agents with advanced capabilities in reasoning, planning, and adaptive tool utilization to address complex business challenges.
Drive complete ownership of the AI/ML lifecycle – encompassing implementation, comprehensive testing, deployment, and continuous operational monitoring – delivering projects on schedule and to specification.
Produce high-quality, maintainable code for model training pipelines, evaluation frameworks, and inference services that meet production standards.
Partner strategically with cross-functional stakeholders including product leaders, data scientists, application teams, vendors, and partners to align on requirements, iterate on solutions, and deliver successful outcomes.
Provide hands-on technical leadership, driving architectural decisions and championing best practices across AI development, LLMOps, quality assurance, and production deployment.
Design and implement responsible AI frameworks including hallucination detection, safety guardrails, comprehensive evaluation systems, and observability infrastructure to ensure model reliability, accuracy, and ethical deployment.
Establish comprehensive evaluation frameworks for Large Language Models and agent-based systems, measuring model quality, task success rates, safety compliance, and operational effectiveness.
Proactively identify and resolve technical blockers that could impact project timelines or deliverables.
Communicate technical strategy and progress to executive leadership and key stakeholders with clarity and confidence.
Engage directly in development and problem-solving, particularly on high-complexity technical challenges, to maintain project velocity and quality.
Drive innovation through research and experimentation with emerging AI technologies and frameworks, evaluating and integrating new capabilities that advance our platform.