Compute Server Firmware Test Engineer

Celestica International LP
Austin, TX
Onsite

About The Position

The Server Compute CPU & GPU Firmware Test Engineer will play a pivotal role in the design, development, and execution of comprehensive test strategies for our AI data center's server infrastructure. This position requires experience with server architectures, enterprise storage systems, and networking, as well as an understanding of the unique performance and reliability demands of AI/ML workloads. The ideal candidate will be a hands-on engineer capable of driving test automation and collaborating across engineering teams to deliver robust, high-performing solutions.

Requirements

  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related technical field.
  • 3+ years of experience in hardware and/or software testing, preferably with at least 1 year focused on enterprise-level storage and server systems.
  • Strong understanding of server architectures (x86, ARM, GPU servers), CPU/memory subsystems, PCIe, power management, and Baseboard Management Controller (BMC) functionality.
  • Proficiency in scripting languages (e.g., Python, Bash) for test automation and data analysis.
  • Experience with Linux operating systems (e.g., Ubuntu, CentOS, RHEL) and command-line tools.
  • Experience with test methodologies such as performance testing, reliability testing, stress testing, and fault injection.
  • Excellent problem-solving, analytical, and debugging skills.
  • Strong communication and interpersonal skills, with the ability to collaborate effectively across diverse teams.

Nice To Haves

  • Familiarity with OCP (Open Compute Project).
  • Experience with cloud environments (AWS, Azure, GCP) and virtualization technologies.
  • Knowledge of containerization technologies (Docker, Kubernetes).
  • Familiarity with AI/ML frameworks (e.g., TensorFlow, PyTorch) and their infrastructure requirements.
  • Experience with performance profiling tools (e.g., fio, Iometer, Perf, VTune).
  • Contributions to open-source projects related to storage, servers, or testing.
  • Certifications in relevant technologies (e.g., NetApp, Dell EMC, HPE, NVIDIA).

Responsibilities

  • Define, develop, and implement comprehensive test plans and strategies for all storage and server hardware, firmware, and software components within the AI data center environment.
  • Lead the test team in designing, executing, and analyzing complex test cases, including functional, performance, reliability, stress, and endurance testing.
  • Design and implement automated test frameworks and scripts using languages like Python, Go, or similar, to improve efficiency and coverage of testing.
  • Conduct in-depth performance analysis and bottleneck identification for server platforms (e.g., CPU, GPU, memory, PCIe, networking), OpenBMC interfaces and features, and storage systems (e.g., NVMe, SSD, HDD arrays, distributed storage, SAN/NAS). This includes debugging issues related to BMC functionality and its interaction with server hardware.
  • Develop and maintain robust testbeds and infrastructure for continuous integration and validation.
  • Utilize open-source and commercial test tools relevant to server, OpenBMC and storage validation.
  • Collaborate closely with hardware design, software development, infrastructure, and AI/ML engineering teams to understand requirements and integrate testing throughout the product lifecycle.
  • Communicate test progress, results, and critical issues effectively to stakeholders, including executive leadership.
  • Develop specialized test methodologies to validate performance and reliability under heavy AI/ML workloads (e.g., large model training, inference at scale, data ingestion).
  • Understand and test the interactions between GPU-accelerated computing, high-speed networking, and storage systems.

Benefits

  • Equal employment opportunity policy prohibits discrimination based on race, color, creed, religion, national origin, gender, sexual orientation, gender identity, age, marital status, veteran or disability status, or other characteristics protected by law.
  • Retaliation against a person who files a charge of discrimination, participates in a discrimination proceeding, or otherwise opposes an unlawful employment practice will not be tolerated.
  • All information will be kept confidential according to EEO guidelines.