Staff Systems Engineer - Distributed Compute Engineering

Visa • Posted about 9 hours ago

Full-time • Mid Level

Hybrid • Ashburn, VA

5,001-10,000 employees

The IaaS Systems and Storage & Engineering (ISSE) team is part of the Operations & Infrastructure technology organization. Distributed Compute Engineering (DCE) is part of ISSE, and Hardware Engineering is part of DCE. Our vision, mission, and purpose are summarized as follows:

Vision: To become a leading technical engineering professional, pioneering in the design and automation of server infrastructure. We envision creating highly secure and efficient operations environments that drive business success and technological advancement.

Mission: Our mission is to deliver high-quality server infrastructure design and automated implementation. We are committed to operating in complex, highly secure, and highly available environments, while maintaining rigorous operations, security, and procedural models.

Purpose: The purpose of this role is to utilize strong hands-on technical engineering skills to design and automate the implementation of server infrastructure based on business requirements. This role will interact with technology domain experts to maintain high security and availability in complex operational environments, thereby driving business efficiency and security.

Essential Functions:
Extensive Datacenter Experience: Proficient in managing complex, geographically distributed IT infrastructures to ensure high availability and performance.
Advanced Technical Knowledge: Profound understanding of high-performance, highly available, and secure computing systems utilizing x86 technologies and protocols (NVMe, GPU, PCIe).
Enterprise Server and Component Expertise: In-depth knowledge of server components (storage and network controllers, HBAs, SSDs) and their functionalities, essential for maintaining high-performance compute environments.
Processor and GPU Systems Proficiency: Strong grasp of Intel and AMD architectures, GPU systems, memory hierarchy, and hardware-level security to enhance system performance and reliability.
Out-of-Band, UEFI, and BIOS Expertise: Comprehensive understanding of out-of-band management, UEFI, and BIOS settings, and their impact on system performance and security in high-performance computing environments.
Hardware Lifecycle Management: Experienced in hardware lifecycle management, including firmware and OS driver certifications, to ensure the longevity and reliability of compute resources.
Scripting Proficiency: Advanced skills in scripting languages such as PowerShell and Python to automate and optimize infrastructure tasks.
Team and Independent Work: Highly motivated, excellent team player, capable of working independently, with strong analytical and troubleshooting abilities to resolve complex issues and mentor junior staff.

This is a hybrid position. The expected number of days in the office will be confirmed by your hiring manager.

5+ years of relevant work experience with a Bachelor's degree, or at least 2 years of work experience with an advanced degree (e.g., Master's, MBA, JD, MD), or 0 years of work experience with a PhD, or 8+ years of relevant work experience.

6 or more years of work experience with a Bachelor's degree, or 4 or more years of relevant experience with an advanced degree (e.g., Master's, MBA, JD, MD), or up to 3 years of relevant experience with a PhD.
Bachelor's degree or higher in Computer Science, Information Systems, Computer Engineering, Electrical Engineering, or another relevant engineering field.
Broad knowledge of hardware, software, network, and application deployments through automation.
Hardware and infrastructure automation experience in at least one of the following server product lines: HP ProLiant, Dell PowerEdge.
Operating Systems: In-depth experience with RHEL 8 and 9, Rocky Linux 8 and 9, and/or Ubuntu 18.04 and 20.04.
Strong technical, analytical, and troubleshooting skills, and the ability to explain technical concepts and provide guidance to junior staff.
Experience in system monitoring with tools supporting unattended operations.
Engineering knowledge to troubleshoot and resolve storage issues (hosts, SAN switches, and storage devices).
Engineering knowledge of TCP/IP networking: link aggregation, redundancy, switches, routing, and load balancing.
Ability to write technical designs, documentation, and presentations for Compute Infrastructure.
Ability to provide level 3 support and guide level 2 administrators on problem resolution.

Medical
Dental
Vision
401(k)
FSA/HSA
Life Insurance
Paid Time Off
Wellness Program
