Together We Talent • posted about 14 hours ago
Mid Level
Onsite • Collegeville, PA

A leading technology organization is seeking an experienced Azure HPC Platform Engineer to architect, optimize, and support advanced cloud-based HPC systems. The ideal candidate will bring strong Azure infrastructure knowledge, Linux administration skills, and expertise in managing large-scale computing clusters using Slurm and Kubernetes. This is a fully onsite position based in Collegeville, PA.

The Azure HPC Platform Engineer will be responsible for designing and maintaining the organization’s Azure-based HPC infrastructure. This role requires hands-on experience with Linux environments, workload scheduling, and container orchestration. The engineer will collaborate closely with developers, researchers, and system administrators to deliver scalable, secure, and efficient compute environments that power data-intensive applications, particularly within life sciences and research domains.

Responsibilities:
  • Design, deploy, and maintain Microsoft Azure HPC environments.
  • Manage and administer Linux-based HPC systems, including configuration, updates, and troubleshooting.
  • Configure and optimize Slurm workload managers and Kubernetes clusters for performance and scalability.
  • Support integration and administration of Posit Workbench (RStudio), Posit Connect, and Posit Package Manager.
  • Collaborate with stakeholders to implement HPC solutions tailored to research and computational needs.
  • Monitor system performance, identify bottlenecks, and implement improvements.
  • Ensure system security, compliance, and efficient resource utilization.
  • Contribute to automation, scripting, and configuration management.
  • Provide technical documentation and support for HPC users.

Qualifications:
  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 7+ years of experience with Microsoft Azure and cloud infrastructure management.
  • Proven expertise in HPC platform architecture and administration.
  • Strong proficiency in Linux system administration and Azure Fundamentals.
  • Hands-on experience with Slurm workload management and Kubernetes orchestration.
  • Experience managing or supporting Posit Workbench and related tools.
  • Excellent analytical and problem-solving skills for diagnosing complex technical issues.
  • Strong communication and collaboration skills to work across multi-functional teams.
  • Background in Life Sciences or Research Computing environments.
  • Familiarity with Azure Infrastructure, Python scripting, or automation tools.
  • Ability to design and optimize compute and storage solutions for scientific workloads.