About The Position

Meta is seeking a Research Engineering Manager to lead the Evaluations team within Meta Superintelligence Labs (MSL). Evaluations are the core of AI progress at MSL, determining what capabilities get built, which features get prioritized, and how fast our models improve. In this leadership role, you will guide a team of research engineers who curate and build the benchmarks for our most advanced AI models across text, vision, audio, and beyond.

You'll partner with world-class researchers and engineers to define the strategic vision for evaluation infrastructure, while ensuring your team delivers high-quality, scalable benchmarks and reinforcement learning environments. This is a technical leadership role requiring research engineering expertise, people management skills, and experience driving execution on open-ended machine learning challenges with high reliability. The evaluations your team builds will directly shape the research direction and major model lines within MSL, making engineering reliability, rigor, and scalability paramount.

You will excel by maintaining high velocity across your team while adapting to rapidly shifting priorities as we advance the technical research frontier. You'll need to be flexible and adaptive, guiding your team through a wide variety of problems in the evaluations space, from implementing existing benchmarks to developing novel benchmarks and environments. If you are excited about defining the capabilities that drive AI progress, have a track record of building high-performing technical teams, and thrive in fast-paced, high-impact research environments, we encourage you to apply for this leadership opportunity at the core of MSL.

Requirements

  • Bachelor's or Master's degree in Computer Science, Machine Learning, or a related technical field
  • 4+ years of experience in machine learning engineering, machine learning research, or a related technical role
  • 3+ years of experience managing or leading technical teams, including hiring, mentoring, and performance management
  • Proficiency in Python and experience with ML frameworks such as PyTorch
  • Proven track record of leading medium to large-scale technical projects from conception to deployment
  • Demonstrated experience balancing hands-on technical work with people management and strategic planning
  • Clear communication skills and experience influencing cross-functional stakeholders

Nice To Haves

  • Publications at peer-reviewed venues (NeurIPS, ICML, ICLR, ACL, EMNLP, or similar) related to language model evaluation, benchmarking, or deep learning
  • Hands-on experience with language model post-training and deep learning systems, or building reinforcement learning environments
  • Experience implementing or developing evaluation benchmarks for large language models and multimodal models (e.g., vision-language, audio, video)
  • Experience building and scaling large-scale distributed systems and data pipelines
  • Familiarity with language model evaluation frameworks and metrics
  • Track record of open-source contributions to ML evaluation tools or benchmarks
  • Experience managing teams in fast-paced research or startup environments
  • PhD in Computer Science, Machine Learning, or related field

Responsibilities

  • Build, mentor, and grow a team of research engineers and scientists focused on evaluation infrastructure and benchmarking
  • Conduct performance reviews, career development conversations, and provide technical mentorship to team members
  • Foster a culture of engineering excellence, research rigor, and rapid iteration within the team
  • Partner with recruiting to hire world-class research engineering talent
  • Curate and integrate publicly available and internal benchmarks to guide capability development for frontier models
  • Oversee the development and implementation of evaluation environments, including environments for novel model capabilities and modalities
  • Establish partnerships with external data vendors to source and prepare high-quality evaluation datasets
  • Influence the technical roadmap for evaluation infrastructure in collaboration with the MSL Infra team
  • Translate the technical vision of research scientists into actionable engineering plans and execution strategies
  • Partner with research scientists, product teams, and other engineering teams to align evaluation priorities with organizational goals
  • Build robust, reusable evaluation pipelines that scale across multiple model lines and product areas
  • Drive the development of evaluation tooling that measures the quality and reliability of evaluation suites
  • Communicate technical progress, challenges, and strategic decisions to leadership
  • Maintain technical credibility through hands-on contributions to critical evaluation projects (20-30% of time)
  • Review code, provide technical guidance, and unblock complex technical challenges
  • Set engineering standards and best practices for the team
  • Follow software engineering best practices, including version control, testing, code review, and system design

Benefits

  • Bonus
  • Equity
  • Benefits