Senior Infrastructure & Hardware Engineer

Zoom•San Jose, CA

58d•Hybrid

About The Position

As a Senior Data Center Infrastructure & Hardware Engineer, you’ll take ownership of designing, deploying, and maintaining high-performance compute (HPC) and AI server environments. This is a hands-on, technically demanding role where you’ll be at the forefront of data center innovation, bridging hardware engineering, infrastructure design, and operational excellence.

Requirements

Experience specifically with Dell PowerEdge, DCC, or similar enterprise server platforms.
System-level understanding of server products, including their physical, functional, logical, mechanical, electrical, software, thermal, and cooling aspects.
Understanding of high-speed transceiver interfaces such as PCIe (physical, data link, and transaction layers) and memory interfaces such as DDR4/DDR5, as well as the requirements to test them.
Experience developing large-scale infrastructure, distributed systems, or networks, or experience with compute technologies, storage, or hardware architecture.
Experience in a data center, field service, or hardware support role working with enterprise servers or converged/hyperconverged systems.
Hands‑on skills in rack‑and‑stack, cabling, and power connectivity (single‑phase and three‑phase AC, PDUs, dual‑cord redundancy).
Working knowledge of TCP/IP networking, including VLANs, basic switch configuration concepts, and OOB management networks.
Working knowledge of server BIOS/firmware configuration, hardware diagnostics, and common fault isolation techniques.
Experience administering lab equipment such as oscilloscopes, BERTs, and analyzers.
Experience collecting and organizing data to reproduce results and explain what happened.
Ability to use scripting and automation to execute test algorithms and analyze results.
Ability to read and interpret rack elevations, power and network diagrams, and implementation runbooks.
Excellent vendor‑facing communication skills and the ability to coordinate with external and internal technical stakeholders.
Willingness and ability to travel to customer sites, sometimes on short notice, and to work in raised‑floor and high‑density data center environments.

Responsibilities

Define and influence the technology vision and roadmap for HPC (high-performance compute) and AI servers with GPU acceleration.
Evaluate emerging data center infrastructure technologies and standards, and provide architecture and design recommendations to DevOps teams and leadership.
Lead technical discovery and collaborate with technical stakeholders on workloads, architectures, integration requirements, and success criteria.
Translate complex application needs into clear requirements and solution proposals.
Represent InfraOps and co-lead technical steering for major programs (e.g., next‑gen platform, AI/advanced workloads, large OEM/ODM partnerships).
Review and validate data center readiness, including rack layout, power and cooling capacity, hot/cold aisle design, and space for planned DCC (direct connect cable) and server deployments.
Assess and confirm AC power and PDU design (voltage, phase, breaker sizing, redundancy) in line with data center best practices prior to connecting any hardware.
Confirm availability of required network ports, VLANs, management networks (e.g., iDRAC / OOB), and cabling paths.
Deploy third-party integrated servers and associated DCC infrastructure, including PDU energization and grounding.
Install and rack AI servers and associated DCC infrastructure in third‑party racks, including rails, mounting, cable management, and grounding.
Connect redundant power feeds so that each server’s PSUs/power zones land on independent PDUs or power sources, preserving high availability.
Apply and verify line‑cord identification labels, ensuring clear mapping to PDUs and breaker positions for ongoing operations and support.
Collaborate with Network Engineering and System Administration teams to develop network cabling designs from servers to top‑of‑rack and core switches according to design diagrams.
Lead and supervise remote-hands services and third-party vendors during site deployments.
Assist with network and systems kickstart procedures: configure iDRAC / OOB management (IP addressing, DNS, routing, access control) and confirm reachability from designated management networks.
Lead and train junior staff on BIOS and firmware settings for target solutions (e.g., virtualization, secure boot, boot order) consistent with Zoom-approved configurations.
Power on servers and run initial diagnostics to verify hardware health, inventory, and proper reporting into Zoom asset management tools (NetBox, CMDB, etc.).
Validate power redundancy and failover behavior (e.g., PSU or PDU failure scenarios where appropriate) and confirm design objectives are met.
Produce and review as‑built documentation (rack elevations, serial numbers, PDU/breaker mappings, management IPs, versions, deviations from design) and obtain stakeholder sign‑off.
Serve as the senior escalation point for DCOps, leading and supporting onsite break/fix work for DCC and server hardware, including FRU/CRU replacement and post‑repair validation.
Collaborate closely with hardware technical support, engineering, and lab teams on complex issues, escalations, and solution validation scenarios.
Influence standards, design reviews, and technical governance for data center infrastructure and compute hardware technologies.