About The Position

We are seeking a principled and dynamic Lead Systems and Validation Engineer to drive end-to-end system validation for AI compute blade and rack platforms. This high-visibility role focuses on first silicon bring-up, system enablement, and post-silicon validation across hardware, firmware, and software domains. You will lead validation strategy, execution, and debug while collaborating closely with cross-functional engineering teams to deliver high-quality, industry-leading AI systems.

Requirements

  • Strong analytical and problem-solving skills with high attention to detail.
  • Extensive experience in system validation encompassing initial silicon integration and HW/FW/SW validation.
  • Strong understanding of industry-standard interfaces such as PCIe and CXL.
  • Experience enabling storage and networking systems in lab environments for end-to-end validation.
  • Deep knowledge of ARM or x86 architectures, SoC design, memory, RAS, and power management.
  • Experience with system-level debug, operating systems, BIOS, and device drivers.
  • Strong experience with Linux, Windows, and virtualization technologies (VMware, KVM, Hyper-V).
  • Excellent communication, organization, and cross-functional collaboration skills.
  • Ability to lead multiple workstreams and deliver under tight timelines.
  • Experience in technical program management and driving validation initiatives.

Nice To Haves

  • Master’s or PhD in Electrical Engineering, Computer Engineering, or related field.
  • Demonstrated ability to perform complex system validation and debug in data center or rack-scale environments.
  • Experience designing or validating AI/ML rack-scale systems.
  • Knowledge of hardware development best practices and industry standards.
  • Familiarity with emerging AI and data center infrastructure technologies.
  • Experience collaborating with ODM partners globally.

Responsibilities

  • Lead system enablement including first silicon bring-up and integration of firmware and platform components to meet system architecture specifications.
  • Develop validation methodologies, lab hardware infrastructure, system software capabilities, and debug tools for blade and rack-level validation.
  • Triage and debug issues across bring-up, post-silicon validation, and production phases, ensuring timely, high-quality resolution.
  • Define and own validation test plans, develop test cases, and drive execution across domains such as CPU, GPU, memory, HBM, and IO.
  • Drive innovation in validation through development of tools, scripts, and improved methodologies to enhance coverage, efficiency, and quality.
  • Enable comprehensive system verification by integrating storage, networking, and infrastructure capabilities within lab environments.
  • Collaborate with multi-functional teams including system architecture, embedded software, hardware build, and ODM partners.
  • Provide technical leadership and mentorship, fostering engineering excellence and continuous improvement.