Stratus Technologies - posted 8 months ago
$79,000 - $95,000/Yr
Full-time • Mid Level
Professional, Scientific, and Technical Services

At Penguin Solutions, we understand the boundless potential of technology and support our customers in turning cutting-edge ideas into outcomes faster and at any scale. With over two decades of experience as trusted advisors, Penguin Solutions is an end-to-end technology company solving the industry's most complex challenges in computing, memory, and LED solutions. Penguin designs, builds, deploys, and manages high-performance, high-availability enterprise solutions, allowing customers to achieve their breakthrough innovations.

We are looking for a Data Center Technician who wants to apply their technical skills in a fast-paced and complex environment. A working knowledge of server hardware and the desire to participate in projects at a large-scale data center are central to this role. The technician will diagnose and resolve compute issues at scale, escalate issues when needed, and work with remote engineering teams. The role also supports rack lifecycle processes, with a focus on helping build out and support cloud-scale compute and storage environments. Solid communication skills are required.

  • Monitor and perform ongoing maintenance on servers and network equipment, including hardware troubleshooting and replacement of hardware components.
  • Support customers and staff, and respond to server and network hardware issues.
  • Perform complex hardware diagnostics and replace failing parts in a timely manner.
  • Collaborate with systems, software and network engineering teams on overall high-performance computing cluster health.
  • Move, replace, and upgrade internal system components, including CPUs, memory, hard drives, and network cables.
  • Work within the client ticketing system for all hardware-related cluster observations.
  • Coordinate with the logistics/inventory management team on the removal of hardware components from the high-performance computing cluster.
  • Guide junior technicians on ticket resolution and complex observations with the high-performance computing cluster.
  • Support findings with root cause analysis and continued improvement.
  • Participate in an on-call rotation to provide critical support for AI and HPC operations.

  • 5+ years of experience installing, monitoring, and maintaining data center equipment.
  • Exceptional ability to work as part of a team, provide hardware support, and resolve problems.
  • Familiarity with equipment inventory management systems or related databases.
  • Proficiency in documenting processes through collaboration with other stakeholders.
  • Hands-on experience working with network cables, including MPO, and installing/replacing DIMMs, CPUs, and other system components.
  • Ability to collaborate with third-party partners in support of the overall high-performance computing cluster health.
  • Exceptional communication skills.
  • Must be able to lift and move equipment weighing 50 pounds or more, as required by this role.
  • Willingness to respond to network and server errors after hours.
  • US citizenship is required for this role.

  • Experience managing liquid-cooled hardware.
  • In-depth knowledge of data center power and cooling operations.
  • Ability to keep up with advancements in data center infrastructure and technologies.
  • CompTIA Server+ / A+ / Network+ or a similar certification will be viewed favorably.

  • Medical, dental, and vision benefits available.
  • 401(k) savings plan.
  • Paid Time Off.
  • Life Insurance.
  • Employee Assistance Plan.