Sr. DevOps Engineer (HPC)

Space Exploration Technologies, Hawthorne, CA
35d

About The Position

SpaceX is looking for a Sr. DevOps Engineer with strong knowledge and experience in a world-class engineering organization. This employee will be a member of the HPC team and will support SpaceX personnel and proprietary systems. The ideal candidate will be flexible and flourish in a fast-paced and challenging environment. They should be a self-starter and self-motivator, and possess the ingenuity to excel in this position.

Requirements

  • 5+ years of hands-on experience with client and server hardware/software, management tools, enterprise networking, virtualization, and security technologies.
  • Bachelor's degree in computer science, engineering, math, or a scientific discipline and 5+ years of systems engineering experience; OR 7+ years of professional experience building software in lieu of a degree.
  • Experience with Linux.

Nice To Haves

  • 5+ years of professional experience building, deploying and troubleshooting Linux systems.
  • Experience with a scripting language (Bash, Python) to automate and solve recurring tasks.
  • Experience building, deploying and troubleshooting HPC clusters.
  • Familiarity with cluster resource managers (Slurm, PBS, LSF).
  • Experience with monitoring and alerting technologies (Prometheus, Grafana, Nagios).
  • Familiarity with scientific and engineering computing (CFD, FEA).
  • Familiarity with ML frameworks (PyTorch, TensorFlow).
  • Familiarity with GPU usage in a compute cluster and CUDA.
  • Experience with containers (Docker, Podman, Singularity).
  • Experience deploying and maintaining automated configuration management software (Puppet, Ansible).
  • Comfortable working with mission-critical and sensitive systems, with a sense of urgency appropriate to the responsibilities.
  • Eligibility for access to classified material up to TS/SCI with Polygraph.

Responsibilities

  • Administer and manage HPC clusters, storage systems, and high-speed networks.
  • Provide application support to SpaceX employees across engineering disciplines.
  • Install and integrate Linux-based compute clusters.
  • Write instructional documentation and convey highly technical ideas in non-technical terms.

Benefits

  • Base salary is just one part of your total rewards package at SpaceX. You may also be eligible for long-term incentives, in the form of company stock, stock options, or long-term cash awards, as well as potential discretionary bonuses and the ability to purchase additional stock at a discount through an Employee Stock Purchase Plan.
  • You will also receive access to comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short and long-term disability insurance, life insurance, paid parental leave, and various other discounts and perks.
  • You may also accrue 3 weeks of paid vacation and will be eligible for 10 or more paid holidays per year.
  • Employees accrue paid sick leave pursuant to Company policy which satisfies or exceeds the accrual, carryover, and use requirements of the law.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Industry: Transportation Equipment Manufacturing
  • Number of Employees: 5,001-10,000 employees
