Sr. Staff Reliability Engineer

ZT Systems, Secaucus, NJ
Onsite (posted 1d ago)

About The Position

Our reliability team is responsible for evaluating, developing, designing, and implementing software and product reliability test regimens to ensure that ZT products of the highest quality are delivered to our customers. We are looking for a passionate Sr. Staff Reliability Engineer with exceptional knowledge of and experience in developing and manufacturing scalable infrastructure. You will be working with the latest technologies that go into building hyperscale cloud services.

Requirements

  • Minimum B.S. in Electrical Engineering, Computer Science/Engineering, or Software Development and 8+ years of relevant work experience (alternatively, an M.S. and 6+ years)
  • Knowledge of computer systems/hardware structure, as well as switch/network interfaces
  • Knowledge of and/or experience with programming or scripting languages such as Python or shell scripting (Bash and/or PowerShell)
  • Knowledge of statistical & probability techniques and reliability modeling
  • Ability to communicate, collaborate, and lead cross-functionally to resolve issues, including those involving customers

Nice To Haves

  • Fundamental knowledge of computer architecture, server architecture at the block level, and hardware/firmware/OS interactions
  • Working knowledge of PCBA (printed circuit board assembly) design, fabrication, and validation testing
  • Experience with statistical software packages such as ReliaSoft, JMP, and Minitab
  • Working knowledge of electronic components/devices and their failure modes and failure mechanisms
  • Knowledge of industry standards such as IPC, JEDEC, Telcordia, and MIL-STD

Responsibilities

  • Use Design for Reliability principles to ensure that the cloud hardware developed and delivered to data centers meets its specified use conditions and stresses, as intended by its design.
  • Act as the internal consultant on all reliability matters, interface with program management, vendors, and design engineering (as necessary) on key reliability programs and issues, and support the software and script development needs of the reliability team.
  • Create or revise reliability engineering guidelines to improve product field performance through design enhancements that meet reliability goals.
  • Use principles of performance evaluation and prediction to improve the reliability and maintainability of cloud infrastructure servers.
  • Identify, collect, analyze, and manage various types of data to minimize failures and improve product performance.
  • Develop scripts that represent the expected environment and operational conditions.
  • Collaborate with other development functional teams and internal stakeholders regarding the application of Design for Reliability principles to ensure products meet customer expectations.

Benefits

  • Competitive base salary
  • Performance-based annual bonus eligibility
  • 401(k) retirement savings plan
  • Tuition reimbursement for eligible education programs
  • Comprehensive medical, dental, and vision coverage with access to leading providers
  • Mental health resources and employee wellness support programs
  • Company-paid life and disability insurance
  • Paid time off (PTO) and company-paid holidays
  • Parental leave and family care support programs
  • Structured training programs and on-the-job learning opportunities
  • Matching gifts and volunteer programs to support causes you care about