CoreWeave - posted about 2 months ago
Full-time • Mid Level
Hybrid • New York, NY
501-1,000 employees
Professional, Scientific, and Technical Services

We are seeking a highly motivated and experienced Infrastructure and Node Delivery Technical Program Manager (TPM) to join our dynamic team focused on the New Product Introduction (NPI) of our next-generation GPU hardware provisioning and delivery. In this pivotal role, you will lead cross-functional efforts from concept to mass production, ensuring the timely, high-quality, and cost-effective delivery of our innovative GPU products that power the future of AI. You will navigate the complexities of hardware development cycles, collaborate with world-class engineers and external vendors, and influence strategic decisions to bring groundbreaking technology to market.

Responsibilities:

  • Production Readiness: Ensure infrastructure and system software are production-ready for new hardware and compute platforms. Engage in technical discussions with engineering teams, challenge assumptions, and contribute to problem-solving.
  • Program Leadership: Drive end-to-end programs spanning GPU provisioning, at-scale deployments, Fleet NPI readiness, and vendor management. Anticipate and identify potential risks, proactively develop mitigation strategies, and drive timely resolution of technical and logistical challenges.
  • Reliability & SLA Management: Coordinate with hardware compute engineering, Fleet teams, and external vendors to maintain service reliability, enforce SLAs, and lead incident response efforts.
  • Observability & Telemetry: Partner with engineering teams to improve monitoring, telemetry, and fleet observability for proactive performance management.
  • Metrics & Insights: Define and track metrics around GPU fleet health, performance, and reliability.
  • Postmortems & Continuous Improvement: Run post-incident reviews and drive action items that enhance system reliability and prevent regressions.
  • Internal Enablement: Collaborate with internal customers to collect feedback, enable adoption of core infrastructure platforms, and refine onboarding experiences (e.g., K8s Core Interface, CKS, SUNK) for hardware compute NPIs.
  • Cross-functional Coordination: Work closely with Product, Infrastructure, Platform Engineering, Vendor, and Customer Experience teams to align on roadmap priorities and customer delivery timelines.
  • Effective Communication: Communicate program status, risks, and critical decisions to senior leadership and executive stakeholders with clarity and conciseness. Foster a culture of transparency, collaboration, and continuous improvement within the NPI process.

Qualifications:
  • Bachelor's degree in Electrical Engineering, Computer Engineering, or a related technical field.
  • 10+ years of experience in technical program management in GPU provisioning, fleet management, or large-scale compute infrastructure.
  • Background in observability, monitoring, or telemetry systems (e.g., Prometheus, Grafana, OpenTelemetry).
  • Hands-on experience coordinating NPI or GTM readiness for compute products.
  • Technical understanding of system software orchestration and hardware/software integration.
  • Solid understanding of hardware and fleet development lifecycles.
  • Proven ability to lead cross-functional teams, influence without direct authority, and drive consensus in a fast-paced environment.
  • Exceptional communication, interpersonal, and presentation skills.
  • Proficiency in program management tools (e.g., Jira, Confluence, Sheet).

Preferred qualifications:
  • Master's degree in Engineering or an MBA.
  • Experience with GPU or other high-performance compute architecture NPI.
  • Experience working with international manufacturing partners and supply chains.
  • Experience with agile methodologies in a hardware and software development context.

Benefits:
  • Medical, dental, and vision insurance - 100% paid for by CoreWeave
  • Company-paid Life Insurance
  • Voluntary supplemental life insurance
  • Short and long-term disability insurance
  • Flexible Spending Account
  • Health Savings Account
  • Tuition Reimbursement
  • Ability to Participate in Employee Stock Purchase Program (ESPP)
  • Mental Wellness Benefits through Spring Health
  • Family-Forming support provided by Carrot
  • Paid Parental Leave
  • Flexible, full-service childcare support with Kinside
  • 401(k) with a generous employer match
  • Flexible PTO
  • Catered lunch each day in our office and data center locations
  • A casual work environment
  • A work culture focused on innovative disruption