Engineering Manager, ML Infrastructure - Moveworks

ServiceNow · Mountain View, CA
Onsite

About The Position

As the Engineering Manager for the Machine Learning Infrastructure team, you will lead development of the platform that powers Moveworks' conversational AI. This role is critical to the long-term scalability of our core AI product and, ultimately, the company. Your primary mission is to lead a team of engineers in building, optimizing, and scaling the end-to-end systems for the entire ML/LLM lifecycle, including our infrastructure for distributed training and inference, model evaluation frameworks, and LLM latency optimization. You will guide the team's technical vision, balancing the operational demands of our core ML infrastructure with forward-looking research to build the next generation of LLMs. The frameworks your team builds serve as the foundation for all ML models in production, serving hundreds of millions of enterprise employees. Your contributions will be instrumental in shaping the Moveworks Enterprise Copilot platform and defining the future of AI-driven employee services.

Requirements

  • A Master's or Ph.D. in Computer Science, Machine Learning, or a related field.
  • 5+ years of industry experience with a proven track record of leading or managing high-performing machine learning or infrastructure teams.
  • Deep technical expertise in designing, building, and scaling end-to-end machine learning systems in production environments.
  • Strong command of Python and experience with performant languages such as C++ or Go.
  • Extensive experience with deep learning frameworks and libraries such as PyTorch or Hugging Face Transformers.
  • Hands-on experience with modern LLM infrastructure, including distributed training frameworks (e.g., DeepSpeed), inference/serving frameworks (e.g., vLLM, TensorRT-LLM), and container orchestration (e.g., Kubernetes).
  • A strategic mindset with experience balancing the demands of operating robust, scalable infrastructure with the need for forward-looking research and development.
  • Excellent communication and collaboration skills, with experience working cross-functionally to deliver complex projects.

Responsibilities

  • Lead, Mentor, and Grow a world-class team of ML and Systems Engineers, fostering a culture of innovation, ownership, and operational excellence that aligns with Moveworks' core principles.
  • Own the Technical Vision and roadmap for the end-to-end ML platform that powers the entire lifecycle—from data synthesis and distributed training to ultra-low-latency inference and serving—for hundreds of production models, including our proprietary MoveLM series.
  • Drive the Strategy for model performance and efficiency, making critical architectural decisions to optimize our GPU infrastructure for latency, throughput, and cost at massive scale.
  • Partner with Leaders across agentic platform, search platform, product engineering, and core infrastructure teams to define and deliver the foundational infrastructure that will power the next generation of agentic AI experiences.
  • Champion a Product Mindset for your platform, building powerful abstractions and tools that accelerate the velocity of machine learning engineers and researchers across the organization.

What This Job Offers

Job Type

Full-time

Career Level

Manager

Education Level

Ph.D. or professional degree
