About The Position

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we’re looking for talented people who want to help. You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

AWS Infrastructure Services (AIS) is looking for a Senior Manager to lead GenAI/ML Infrastructure Demand & Supportability Planning. Our team is responsible for AWS's global GenAI and Machine Learning infrastructure supply and demand strategy, spanning 105 availability zones, 33 geographical regions, and serving over 245 countries and territories. The team drives compute and datacenter supply planning, ML infrastructure capacity delivery, and GenAI supportability to ensure scalable cloud services for millions of AWS customers. In other words, we're the people who keep the cloud running for the next generation of AI workloads. As the Senior Manager of GenAI/ML Infrastructure Demand Planning, you will lead the teams that drive the S&OP (Sales and Operations Planning) process for AWS's Machine Learning and GenAI infrastructure.

You will work closely with EC2 product/forecasting teams, Hardware/Network/DC Engineering, Material Planning, Data Center planning, and Capacity Delivery teams to build S&OP plans that drive short-term and long-term ML/GenAI Supply and Demand strategy. You will be responsible for the demand signal provided to AIS for Machine Learning racks and GenAI infrastructure, ensuring alignment with financial plans, sales and growth projections, agreed-upon demand levers, transitions, NPI roadmaps, and prior commits and plans of record. You will outline supply chain automation roadmaps to scale capacity delivery of Machine Learning infrastructure and guide automation teams on implementing the systems required for various elements of the demand planning process. The GenAI/ML Demand and S&OP Team is responsible for the 13-103 week Demand Plan of Record (POR) for ML Server material planning and the 0-10 year Demand POR for ML Data Center planning.

You are an experienced leader with a demonstrated record of leading large cross-functional and cross-organizational projects in the ML/AI infrastructure space. You are expected to take large, technically complex projects, break them down into manageable pieces, develop actionable plans, and deliver them successfully. This role is inherently cross-functional and requires the ability to think big and collaborate with others as you work closely with teams across AWS Infrastructure and AWS Service teams. The shifting power and permitting constraints, combined with the rapid evolution of GenAI workloads and the critical nature of the role, require someone who maintains momentum, clarity of vision, and adaptability while communicating effectively across a diverse set of customers, partners, and leadership. The Sr. Manager must effectively distinguish between one-way and two-way door decisions. Decisions driven by the Sr. Manager have a significant impact on the AWS Infrastructure organization and all of our customers, so excellent judgment (based on domain and technical expertise) is required in managing complex problems, tough trade-offs, proposals, and escalations. Communication with executive audiences is a regular occurrence. Success in this role therefore requires high judgment, negotiation skills, the ability to influence without authority, analytical talent, technical aptitude, and the leadership to collaborate with a diverse set of stakeholders across multiple time zones, manage capital budgets, eliminate non-value-add activity, design solutions, remove roadblocks, and find creative ways to accelerate ML infrastructure delivery.

Requirements

  • Bachelor's degree or equivalent in a related technical field
  • 10+ years of relevant experience in positions that require supply and demand planning, including the ability to manage large volumes of data to support other functions that plan sourcing, production, and delivery requirements
  • 6+ years of team management experience
  • 5+ years of technical product or program management experience, including experience working directly with software engineering teams
  • Experience owning program strategy across cross-functional teams, end-to-end delivery, and communicating results to senior leadership

Nice To Haves

  • Strong analytical and quantitative skills with the ability to use data and metrics to back up assumptions, evaluate outcomes, and challenge conventional wisdom
  • Demonstrated ability to influence with or without formal authority
  • Excellent written and verbal communication skills for both technical and non-technical audiences
  • 5+ years of experience in project management disciplines, including scope, schedule, budget, quality, risk, and critical path management
  • Experience managing projects across cross-functional teams, building sustainable processes, and coordinating release schedules
  • Experience defining KPIs/SLAs used to drive multimillion-dollar businesses and reporting to senior leadership
  • Experience with ML/AI infrastructure planning and capacity management

Responsibilities

  • Drive the S&OP process for AWS GenAI/ML Infrastructure and influence AWS's ML Supply and Demand strategy
  • Partner with EC2 product/forecasting, Hardware/Network/DC Engineering, and DC ops teams to capture system and operational requirements for ML infrastructure
  • Outline supply chain automation roadmap to scale capacity delivery of Machine Learning and GenAI infrastructure
  • Guide automation teams on implementing the systems required for the various elements of the ML demand planning process, while also directly managing and refining new programs until it makes sense to automate them
  • Partner with SDMs/PEs/Ops teams to evaluate the pros and cons of alternative solutions in order to finalize high-level system solutions and operational processes for ML capacity delivery
  • Manage end-to-end implementation of new supply chain automation capabilities by partnering with software development teams
  • Create product strategy and feature roadmaps via Amazon Working Backwards documents
  • Provide thought leadership to Amazon leadership and business partners, delivering solutions to strategic problems related to ML/GenAI infrastructure scaling
  • Communicate performance against goals and objectives through narratives and business reviews, including bi-weekly leadership updates on capacity delivery status of Machine Learning racks
  • Lead broader initiatives, as assigned, to further advance the effectiveness of the organization
  • Drive operational initiatives to scale capacity delivery of Machine Learning racks through collaboration with EC2 capacity planning, DC Infra planning, EC2 product, and operations teams

Benefits

  • health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance with an option for Supplemental Life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
  • 401(k) matching
  • paid time off
  • parental leave