Senior Systems, Network, and Storage Administrator

Morehouse College, Atlanta, GA

About The Position

The Senior Systems, Network, and Storage Administrator is a key leadership position within the Morehouse Supercomputing Facility (MSF), responsible for the design, implementation, optimization, and day-to-day management of the institution's high-performance computing (HPC) infrastructure. This role supports the computational research and teaching needs of Morehouse College and its academic partners within the Atlanta University Center (AUC). The ideal candidate has extensive hands-on experience managing active HPC environments and large-scale networked systems, and brings deep technical expertise, strategic foresight, and a collaborative spirit to build, sustain, and advance Morehouse's computing capabilities. The position is well suited to someone currently serving as a Systems, Network, and Storage Administrator at a large Research I university who is ready to advance into a senior leadership role as the chief technical administrator of an emerging supercomputing facility.

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, Information Technology, or a closely related field.
  • 5 to 10 years of progressive experience in systems, network, or storage administration, including at least 3 years in an active HPC environment.
  • Proven expertise with Linux/Unix system administration, including installation, configuration, troubleshooting, and performance tuning.
  • Advanced knowledge of LAN/WAN design, firewalls, routing, and network protocols.
  • Demonstrated proficiency in scripting and automation (Python, Bash, or Perl).
  • Experience managing large-scale storage systems, data backup, and recovery solutions.
  • Familiarity with virtualization technologies such as VMware, KVM, or Hyper-V.
  • Experience designing and optimizing complex, distributed computing systems.
  • Strategic thinker capable of designing and scaling complex systems.
  • Strong analytical and troubleshooting abilities.
  • Excellent written and verbal communication skills.
  • Commitment to diversity, equity, and inclusion in technology and education.
  • Ability to work independently and collaboratively in a fast-paced, research-focused environment.

Nice To Haves

  • Vendor-specific certifications such as VMware Certified Professional (VCP), Red Hat Certified Engineer (RHCE) or System Administrator (RHCSA), Cisco Certified Network Associate (CCNA), and CompTIA Security+.
  • Master's degree in a computing-related discipline.
  • Direct experience supporting HPC clusters, GPU computing, or research data workflows.
  • Experience with configuration management (e.g., Ansible, Puppet, or Chef).
  • Demonstrated ability to collaborate effectively with researchers, faculty, and technical teams.

Responsibilities

  • Lead the configuration, deployment, maintenance, and optimization of HPC clusters, including compute nodes, storage systems, and networking components.
  • Ensure maximum uptime, scalability, and performance of all systems supporting research and instructional workloads.
  • Manage user accounts, access permissions, and resource scheduling (e.g., Slurm or an equivalent workload manager).
  • Design, implement, and maintain LAN/WAN infrastructure, routing, firewalls, and switches to ensure secure, high-speed connectivity across the facility.
  • Collaborate with campus IT and security teams to ensure compliance with cybersecurity standards and institutional policies.
  • Monitor network performance and respond proactively to potential issues.
  • Manage large-scale, distributed storage systems to support data-intensive research applications.
  • Implement and maintain reliable data backup, replication, and disaster recovery solutions.
  • Optimize storage performance for HPC and research data workflows.
  • Architect system improvements and infrastructure expansions based on performance metrics, emerging technologies, and institutional research needs.
  • Collaborate with faculty and research teams to tailor computing solutions to specific project requirements.
  • Evaluate and integrate new technologies (e.g., GPUs, cloud HPC, hybrid computing solutions) to enhance system capabilities.
  • Develop and maintain automation scripts for system provisioning, monitoring, and maintenance using Python, Bash, or similar tools.
  • Oversee virtualization technologies (e.g., VMware, KVM) to optimize system efficiency and resource utilization.
  • Serve as the primary technical advisor to the Director of MSF on infrastructure strategy, procurement, and operations.
  • Mentor junior staff, student interns, and research assistants in HPC operations and systems management.
  • Coordinate with AUC and external institutional partners on collaborative HPC and data initiatives.