Meta • Posted 1 day ago
Full-time • Mid Level
Menlo Park, CA

Meta’s Core Infrastructure team seeks a Technical Program Manager (TPM) to lead complex, large-scale projects focused on advancing language model scaling. In this key position, you will collaborate across engineering, hardware, data center, research, and product teams to design, build, and scale the foundational hardware, software systems, and tools that support Meta’s AI innovation. You will be responsible for driving the end-to-end integration of new AI hardware and the core infrastructure stack, from initial design validation of our software stack through production deployment. This includes developing and refining repeatable frameworks for efficient onboarding, ensuring robust and predictable execution, and proactively resolving technical and organizational challenges to maintain project momentum. You will use your problem-solving skills, technical acumen, and business insight to streamline the onboarding of new AI hardware platforms into Meta’s suite of core infrastructure services. You will communicate transparently across all levels, motivate multidisciplinary teams, and champion best practices to deliver impactful outcomes that advance Meta’s infrastructure.

  • Establish and lead effective program teams to ensure alignment and achieve common objectives
  • Work closely with engineering, data center, hardware and business stakeholders to define program requirements, prioritize initiatives, and establish scope, including shaping the roadmap and long-term strategy for partner teams
  • Create and implement communication strategies to proactively share program status, challenges, and risks with stakeholders
  • Drive successful outcomes by actively managing cross-functional dependencies, mitigating risks, and adjusting scope, timeline, and resources as needed
  • Collaborate with cross-functional teams to lead the end-to-end lifecycle of programs, including technical analysis, design, development, testing, implementation, and post-launch support
  • Establish and track key metrics, quality benchmarks, and performance indicators to drive accountability and ensure effective cross-functional execution of program deliverables
  • Anticipate and evaluate complex, long-term infrastructure challenges in close partnership with engineering leaders and key stakeholders
  • Drive product strategy to support and align with key company initiatives
  • Lead process improvements across internal and external teams, streamlining workflows and reducing manual effort through automation
  • Bachelor of Science in Electrical Engineering, Computer Science, Mechanical Engineering, or a related technical field, or equivalent experience
  • 12+ years of experience in software engineering, hardware engineering, systems engineering, or technical product/program management
  • Knowledge of software and hardware development for large-scale hardware readiness, including end-to-end product development processes
  • Excel at clearly communicating complex technical investments in a simple and understandable manner
  • Experience delivering complex technology programs and products from inception through to successful delivery
  • Experience understanding user needs, gathering requirements, and defining project scope
  • Experience working under your own initiative, across multiple teams, demonstrating critical thinking and providing thought leadership in ambiguous spaces
  • Experience defining and optimizing engineering processes at scale
  • Excel at building cross-functional relationships and thrive amid complex challenges
  • Experience in analytical thinking and problem-solving for large-scale systems
  • Experience building work relationships across multi-disciplinary teams and with partners in different time zones
  • Experience defining strategic direction and identifying new opportunities for impact across products, platforms, and programs
  • Experience communicating at the executive level and influencing leadership and technical management teams to drive the development of systems, solutions, and products
  • Knowledge of large language models, machine learning, and scaling distributed systems
  • Demonstrated experience identifying new opportunities for the larger organization and influencing the appropriate stakeholders
  • Proven commitment to scaling infrastructure for large-scale distributed AI compute systems
  • Knowledge of software and hardware development for large scale system readiness
  • bonus
  • equity
  • benefits