Site Reliability Engineer (Top Secret Clearance)

SpaceX
Hawthorne, CA
$145,000 - $175,000

About The Position

As a member of the Classified IT Systems Engineering team, the Site Reliability Engineer helps design scalable systems capable of supporting a growing volume of data products generated en masse. We build tools that enable us to work more efficiently and that help us build software systems that are secure, reliable, and autonomous. Our engineers are responsible for the full life cycle of the systems they create, including development, testing, and operational support.

Requirements

  • Bachelor’s degree in computer science, information systems/IT, or an engineering discipline; OR 2+ years of professional experience in software, DevOps, or site reliability engineering in lieu of a degree
  • 1+ year of experience with Kubernetes
  • 1+ year of experience with Linux operating systems
  • Experience in Bash, Python, and/or other scripting languages
  • Experience building, maintaining, and scaling on-premises and/or cloud systems

Nice To Haves

  • Active Top Secret, Top Secret SCI, or DOE Level Q clearance is highly desired
  • Experience hosting and pushing the state of the art in inferential model benchmarks
  • Experience with systems administration, site reliability engineering, or DevOps engineering
  • Experience with Python and Python-based development frameworks
  • Experience with virtualization and hypervisor technologies
  • Experience with automatically managing dozens or hundreds of servers
  • Knowledge of performance bottlenecks and performance improvement techniques
  • Excellent communication skills, with the ability to communicate with customers, peers, and management in both formal and informal situations
  • Ability to quickly learn new tools and frameworks

Responsibilities

  • Develop automation to deploy and manage compute resources both on-premises and in the cloud
  • Build, maintain, and scale on-premises hardware systems designed to host GPU-accelerated machine learning workloads
  • Deploy and manage core infrastructure such as databases, monitoring, and storage
  • Collaborate closely with software engineers to create highly scalable, operable, and maintainable products
  • Engage in and improve the whole life cycle of services, from inception and design through deployment, operation, and refinement

Benefits

  • company stock or long-term cash awards
  • potential discretionary bonuses
  • Employee Stock Purchase Plan
  • comprehensive medical, vision, and dental coverage
  • 401(k) retirement plan
  • short- and long-term disability insurance
  • life insurance
  • paid parental leave
  • various other discounts and perks
  • 3 weeks of paid vacation
  • 10 or more paid holidays per year
  • paid sick leave