Failure Analysis Engineer

Hyve Solutions, Fremont, CA

About The Position

At Hyve Solutions, our mission is to empower customers, business partners, and employees to achieve success through shared goals, innovative strategies, and cutting-edge technology solutions. As a leader in data center solutions, we specialize in designing, manufacturing, and delivering custom Server, Storage, and Networking Solutions for the world’s largest Cloud, Social Media, and Enterprise companies.

Position Summary

We are seeking a highly analytical Failure Analysis Engineer to support the investigation of hardware failures in rack systems, server platforms, and data center infrastructure products. This role is responsible for diagnosing complex electrical, mechanical, thermal, and system-level failures throughout the product lifecycle, including manufacturing, qualification, customer returns, and field reliability.

Requirements

  • Bachelor’s degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or a related engineering discipline.
  • 3+ years of experience in failure analysis, hardware validation, quality engineering, reliability engineering, or manufacturing engineering.
  • Experience supporting enterprise servers, rack systems, storage platforms, networking equipment, or data center infrastructure.
  • Strong understanding of server architecture, including but not limited to CPUs, GPUs, memory (DDR4/DDR5), PCIe architecture, NVMe storage, Ethernet networking, BMC/IPMI management, and power distribution systems.
  • Experience troubleshooting complex hardware failures at the system and board level.
  • Knowledge of schematic review and hardware debugging techniques.
  • Ability to interpret manufacturing and test logs to identify failure mechanisms.
  • Excellent analytical, communication, and technical documentation skills.

Nice To Haves

Key Competencies

  • Strong analytical and troubleshooting skills
  • Cross-functional collaboration
  • Data-driven decision making
  • Technical writing and presentation
  • Continuous improvement mindset
  • Ability to manage multiple high-priority investigations in a fast-paced environment

Responsibilities

  • Perform failure isolation at the component, subsystem, and rack level, as well as root cause analysis on failures involving server systems, rack-level assemblies, storage platforms, networking hardware, and associated components.
  • Investigate failures from manufacturing, system integration, reliability testing, customer returns (RMA), and field deployments.
  • Analyze electrical, mechanical, thermal, and firmware-related failures using structured troubleshooting methodologies.
  • Utilize laboratory equipment including oscilloscopes, digital multimeters, power analyzers, thermal cameras, logic analyzers, X-ray systems, optical microscopes, and environmental test equipment.
  • Conduct board-level debugging of server motherboards, backplanes, power distribution boards (PDBs), power supplies, GPU modules, CPUs, DIMMs, NICs, storage devices, and PCIe components.
  • Work closely with Design Engineering, Manufacturing Engineering, Quality, Reliability, Supplier Quality, Test Engineering, and Operations to identify corrective actions.
  • Lead Root Cause Analysis (RCA) activities using 8D, 5-Why, Fishbone Diagram, Fault Tree Analysis (FTA), and Failure Modes and Effects Analysis (FMEA).
  • Develop and publish detailed failure analysis reports, including technical findings, corrective actions, and preventive recommendations.
  • Support reliability qualification testing, HALT/HASS, thermal validation, vibration testing, and environmental stress testing.
  • Identify recurring failure trends through statistical analysis and recommend design or process improvements.
  • Drive corrective and preventive actions (CAPA) to improve product reliability and manufacturing yield.
  • Collaborate with suppliers to investigate component-level failures and improve incoming material quality.
  • Support customer escalations by providing technical expertise during failure investigations.