Senior Technical Program Manager

Microsoft, Redmond, WA

About The Position

We are seeking a Senior Technical Program Manager (L64) to join the AI Delivery Inferencing Portfolio team, a critical enabler of Microsoft’s strategy to rapidly launch and scale next-generation GPU infrastructure for AI workloads. This role owns a regional portfolio of feasibility assessment programs for GPU deployments and plays a pivotal role in accelerating the release of high-quality GPU cluster layout designs (CLDs) at scale. The TPM will be part of a senior program management team responsible for driving end-to-end readiness across a portfolio of accelerated, purpose-built AI datacenters, enabling Cloud Operations + Innovation (CO+I) to meet aggressive capacity, quality, and time-to-market objectives. Success in this role requires comfort operating under ambiguity, deep technical curiosity, strong orchestration skills, and the ability to create clarity and momentum across highly interdependent engineering and delivery teams.

Requirements

  • Bachelor's Degree AND 4+ years of experience in engineering, product/technical program management, data analysis, or product development OR equivalent experience.
  • 2+ years of experience managing cross-functional and/or cross-team projects.

Nice To Haves

  • 7+ years of experience in technical program management, infrastructure delivery, or large‑scale systems programs.
  • Demonstrated experience managing complex, cross‑functional portfolios with significant technical and operational dependencies.
  • Strong technical aptitude, with the ability to quickly learn and reason about datacenter infrastructure, GPU platforms, power and cooling systems, and deployment constraints.
  • Proven ability to design scalable processes, create clarity under ambiguity, and drive execution across diverse stakeholder groups.
  • Exceptional communication skills, with experience influencing without authority at senior levels.
  • Experience supporting AI, High Performance Compute, or related accelerated infrastructure programs.
  • Familiarity with datacenter design, capacity planning, or technical feasibility assessment processes.
  • Experience driving automation or tooling to improve operational visibility and execution quality.
  • Track record of reducing operational risk, improving quality, and accelerating time to market in high‑stakes environments.

Responsibilities

  • Own Regional GPU Feasibility Portfolio
      • Lead a regional portfolio of GPU feasibility assessment programs, ensuring consistent, predictable, and high-quality execution across multiple parallel initiatives.
      • Drive early feasibility validation across power, cooling, space, network, colo sequencing, and deployment constraints to improve execution signal quality and reduce late-stage risk.
      • Establish and maintain a standardized feasibility pre-check framework that enables faster decision-making and minimizes canceled or reworked execute signals.
  • Accelerate GPU Cluster Design Readiness
      • Partner across organizations to accelerate cluster layout design creation and evolution for next-generation GPU platforms.
      • Create and maintain runbooks, templates, and governance mechanisms that serve as a single source of truth for cluster design workflows, change management, and handoffs.
      • Drive clarity and predictability in change management, balancing speed with quality and downstream execution impact.
  • Cross-Team Orchestration & Governance
      • Act as connective tissue across partner teams, aligning priorities, dependencies, and execution signals.
      • Design and operationalize lightweight but robust governance mechanisms to manage change, dependencies, and risk across complex, cross-org programs.
      • Partner with engineering to enable automation, tooling, and telemetry that improve visibility, tracking, and accountability across the feasibility and design lifecycle.
  • Drive Datacenter Readiness for AI Workloads
      • Contribute as a senior member of the AI Delivery program management team focused on readiness for accelerated, purpose-built AI datacenters.
      • Align feasibility outcomes with capacity planning and execution readiness to ensure on-time, high-confidence launches of GPU capacity.
      • Identify systemic bottlenecks and drive structural improvements that scale beyond a single region or program.
  • Leadership, Culture, and Influence
      • Operate as a trusted partner and thought leader for stakeholders across disciplines and organizations.
      • Model Microsoft’s leadership principles by creating clarity in ambiguity, generating energy through collaboration, and delivering results with high standards.
      • Mentor and influence other program managers, raising the overall bar for program discipline, technical depth, and operational excellence.