Hardware Engineer - Server Hardware Management

Morgan Stanley | New York, NY
1d | $120,000 - $165,000

About The Position

Work on business-enabling infrastructure projects utilizing leading-edge CPU, GPU, APU, storage and networking architectures, security strengthening and operational scaling.

We do it in a way that's differentiated - and we've done that for 90 years. Our values - putting clients first, doing the right thing, leading with exceptional ideas, committing to diversity and inclusion, and giving back - aren't just beliefs; they guide the decisions we make every day to do what's best for our clients, communities and more than 80,000 employees in 1,200 offices across 42 countries. Our teams are relentless collaborators and creative thinkers, fueled by their diverse backgrounds and experiences. We are proud to support our employees and their families at every point along their work-life journey, offering some of the most attractive and comprehensive employee benefits and perks in the industry. There's also ample opportunity to move about the business for those who show passion and grit in their work. To learn more about our offices across the globe, please copy and paste https://www.morganstanley.com/about-us/global-offices into your browser.

Expected base pay rates for the role will be between $120,000 and $165,000 per year at the commencement of employment.

Consequently, our recruiting efforts reflect our desire to attract and retain the best and brightest from all talent pools. We want to be the first choice for prospective employees. It is the policy of the Firm to ensure equal employment opportunity without discrimination or harassment on the basis of race, color, religion, creed, age, sex, sex stereotype, gender, gender identity or expression, transgender, sexual orientation, national origin, citizenship, disability, marital and civil partnership/union status, pregnancy, veteran or military service status, genetic information, or any other characteristic protected by law.

Requirements

  • Minimum four years of hands-on experience supporting and troubleshooting data center GPUs, including H100 and NVIDIA DGX B300 series or newer.
  • Demonstrated proficiency with advanced technologies, including InfiniBand and NVLink.
  • Strong proficiency in Ansible and Python.
  • Experience with IPMI and preferably Redfish for programmatic communication with server BMCs.
  • Ability to collaborate effectively with engineers and developers in Agile environments.
  • Experience managing, deploying, and troubleshooting large-scale production environments, including application of security principles and system hardening.
  • Knowledge of Linux, operating systems, and network protocols.
  • Knowledge of x86 hardware and peripherals, including out-of-band or lights-out management.
  • In-depth knowledge of server hardware, components, and management technologies, particularly GPUs and PCIe devices.
  • Effective troubleshooting skills across hardware, O/S, network, and storage.
  • Master's degree in computer science, computer engineering, or equivalent experience.
  • Excellent communication skills paired with strong self-management capabilities.

Nice To Haves

  • Networking knowledge is an added advantage.
  • Experience working in Financial Services or Enterprise Technology firms is preferred but not mandatory.
  • Experience driving enterprise-level initiatives and working with senior stakeholders across various regions and cultures.

Responsibilities

  • Create clear procedures for testing hardware internally, deploying systems, optimizing performance, and resolving technical issues.
  • Develop and maintain thorough documentation covering hardware designs, specifications, testing procedures, and results.
  • Conduct thorough evaluations of hardware systems to identify operational problems and recommend effective improvements that boost overall efficiency.
  • Create and build software solutions that include in-house systems, third-party vendor platforms, and open-source technologies.
  • Deliver dependable automation solutions designed to enhance the management of our Innovation Lab servers and network infrastructure. This includes facilitating remote access, updating firmware via iDRAC/IPMI, and integrating peripheral devices efficiently.
  • Troubleshoot complex problems involving software programs, operating systems, and hardware components.
  • Assess and certify each new or replacement device, providing thorough analysis of how it integrates with the MS plant.