About The Position

Join NVIDIA as a Solution Architect on the Infrastructure Specialists team. Help redefine deep learning and data analytics, and power data centers worldwide, using NVIDIA products. Collaborate on building the world's largest and fastest AI Factories and supercomputers. We are seeking a candidate who can lead the planning and deployment of large-scale AI data centers, focusing on infrastructure buildout including power and cooling systems, telemetry and control systems, and large-scale design, construction, and delivery processes. In this role, your main focus will be to support customers in the planning, design, construction, and deployment of large-scale AI factories. You will be part of the team building capabilities to design, construct, and deliver large AI factories based on NVIDIA's reference designs. This includes architectural systems, power distribution, cooling systems, integration of telemetry and control systems, and all other physical infrastructure. Collaboration with product and engineering teams, customers, and the partner/provider ecosystem will be crucial to achieving successful deployments.

Requirements

  • Bachelor's degree or equivalent experience in Engineering or a related field. An advanced degree, equivalent experience, or relevant certifications are desirable.
  • We need an expert professional with a background in multiple aspects of infrastructure delivery, preferably of hyperscale data centers. The ideal candidate will have at least 10 years of experience, preferably in sophisticated, high-density AI/HPC data centers.
  • Proven experience in data center engineering, operations, deployment and/or infrastructure management roles, focusing on large-scale data center deployments.
  • Strong technical knowledge and experience in data center systems and processes: power distribution, liquid cooling, rack/server chassis, and cabling.
  • Proven technical and project leadership in fluid situations, and the ability to adapt to change.
  • Strong communication at both the ground-execution and executive levels, internally and with customers.
  • Excellent analytical, problem-solving, communication and decision-making skills, keen attention to detail, and a dedication to quality.
  • Strong record of excellent partnership and putting the mission's success first.
  • Coordination & Time Management – proficient at planning, scheduling, and coordinating tasks related to the job to accomplish objectives within or ahead of designated time frames.
  • Able to travel (25%).

Nice To Haves

  • Experience in hyperscale data center deployment, operations process, safety, and security measures.
  • Solid understanding of the whole data center infrastructure stack.
  • Outstanding social skills.

Responsibilities

  • NVIS Data Center deployment planning: Collaborate with product and engineering teams to understand NVIDIA’s reference architectures for data center infrastructure including power distribution, cooling systems, controls and monitoring, and network/cabling architecture. Support customers and partners in quickly implementing this architecture into advanced and reliable data center designs.
  • Building process capabilities: Collaborate across the org to build processes, partner relationships, and workflows to deliver and deploy large AI factories at the speed of light (SOL).
  • Design and construction oversight: Review and appraise customers' and partners' infrastructure design plans, verifying their compliance with NVIDIA reference architecture, industry standards, and regulatory requirements. Deliver guidance, expertise and suggestions to optimize performance, scalability, and cost-effectiveness. Ensure alignment with our customers and partners on reference architecture, guidelines and processes to make their deployments successful. Assess the operational efficiency, reliability, and readiness of data center infrastructure components before deploying AI/HPC clusters. Develop and implement comprehensive audit plans and conduct pre-deployment audits to identify potential issues, risks, and areas for improvement.
  • Partner and vendor ecosystem: Develop and sustain a strong ecosystem of manufacturers, service providers, and partners as needed, to ensure customers can deploy NVIDIA solutions rapidly and reliably. Be the key liaison for customers and partners on matters of data center infrastructure. Act as the NVIS mentor, providing guidance and support to ensure the team's success in their respective roles.
  • Quality Assurance: Develop and implement quality assurance processes to ensure that deployments meet established specifications and performance benchmarks. Conduct detailed bring-up, testing, and commissioning to validate the functionality and reliability of infrastructure components.
  • Continuous Improvement: Drive continuous improvement initiatives to improve data center infrastructure reliability, resilience, and sustainability. Find opportunities to streamline processes, automate repetitive tasks, and apply new technologies to optimize infrastructure operations.
  • Collaboration and Communication: Collaborate and communicate across internal teams, external vendors, and customers to facilitate the flawless integration of data center infrastructure solutions. Serve as a domain authority and point of contact for infrastructure-related inquiries and critical issues.

Benefits

  • You will also be eligible for equity and benefits.

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Number of Employees

5,001-10,000 employees
