DevOps System Administrator, Scottsdale, Arizona, US, Full Time/Per (MK)

Central Business Solutions
Scottsdale, AZ
Onsite

About The Position

We are seeking a skilled AI DevOps System Administrator to build, manage, and optimize the infrastructure supporting our Artificial Intelligence and Machine Learning initiatives in a classified environment. The ideal candidate will be responsible for maintaining the CI/CD pipeline for ML models, managing GPU resources, and ensuring the stability, scalability, and security of the AI development and deployment environment. This role requires close collaboration with data scientists and ML engineers to streamline workflows from model development to production.

As a seasoned leader, you’ll be involved in our client's decision-making process by serving as a front-line interface to users with technical issues and by conducting systems analysis and development to keep systems current with changing technologies. Your duties may include installing new software, troubleshooting, granting permissions to applications, and training users.

You’ll also be responsible for the day-to-day support of server services: performing server administration for physical and virtual server operating systems, and configuring, maintaining, and troubleshooting physical and virtual hardware and network-related interfaces on servers. We’ll rely on you to maintain and troubleshoot alerts and conduct a complete analysis of them; create scripts to automate repetitive processes; and work with customers to identify, isolate, and resolve problems with servers that are affecting other services.

Requirements

  • Department of Defense TS/SCI with Polygraph security clearance is required at time of hire.
  • U.S. citizenship is required.
  • A Bachelor’s degree in Computer Science, a related field or equivalent experience plus a minimum of 8 years of relevant experience; or Master's degree plus 6 years of relevant experience
  • Advanced understanding of server-based operating systems
  • Strong Linux, container, and AI skills
  • Subject matter expert (SME) with the ability to mentor others on administering the server environment
  • Advanced troubleshooting skills within the server OS as well as in networking and storage technologies
  • Hands-on experience developing, deploying and supporting large-scale enterprise server solutions
  • Experience with Linux
  • Experience with Docker and/or Kubernetes
  • Active DoD TS/SCI with Polygraph clearance. There is no flexibility on this requirement: a clearance that needs to be reinstated does not qualify

Nice To Haves

  • Experience or familiarity with AI/ML models is preferred
  • Team player who thrives in collaborative environments and revels in team success
  • Broad understanding of the interrelationships within the IT environment, with a focus on servers and services
  • Senior-level knowledge of physical and virtual server support
  • Senior-level knowledge of access, permissions, and security, so that clients have the access to data they need to perform their daily activities
  • Identifies opportunities to apply AI for continuous improvement and innovation

Responsibilities

  • Design, implement, and maintain scalable and robust infrastructure for AI/ML model training and inference.
  • Develop and manage CI/CD pipelines for automated building, testing, and deployment of AI applications and machine learning models.
  • Administer and optimize Linux-based systems and virtualized environments.
  • Manage containerization and orchestration platforms (e.g., Docker, Kubernetes) to deploy and scale ML services.
  • Automate infrastructure provisioning, configuration management, and deployment processes using Infrastructure as Code (IaC) tools like Ansible or Terraform.
  • Manage and allocate GPU resources efficiently for model training and other high-performance computing tasks.
  • Implement and maintain monitoring, logging, and alerting systems to ensure platform health and performance.
  • Collaborate with development teams to support their infrastructure needs and troubleshoot issues.
  • Install new software, troubleshoot issues, grant permissions to applications, and train users.
  • Provide day-to-day support of server services by performing server administration for physical and virtual server operating systems.
  • Configure, maintain, and troubleshoot physical and virtual hardware and network-related interfaces on servers.
  • Maintain and troubleshoot alerts, and conduct a complete analysis of each.
  • Create scripts to automate repetitive processes.
  • Work with customers to identify, isolate, and resolve problems with servers that are affecting other services.

Benefits

  • Flexible schedules including a 9/80 option (every other Friday off)
  • Generous paid time off and parental leave
  • 401(k) with 6% company match and immediate vesting
  • Comprehensive medical, dental, and vision insurance
  • Life and disability coverage
  • Tuition assistance for continued education