Senior AI Systems Administrator

MCI Careers, Onsite

About The Position

MCI is seeking a Senior AI Systems Administrator to oversee, optimize, and maintain the infrastructure supporting AI and machine learning platforms. This role is responsible for ensuring high availability, security, scalability, and operational excellence across AI environments while supporting model deployment, training workloads, and enterprise AI initiatives. You will work closely with AI engineers, data scientists, and infrastructure teams to ensure reliable and efficient AI operations. To be considered for this role, you must complete a full application on our company careers page, including all screening questions and a brief pre-employment test.

Requirements

  • Bachelor's degree in Information Technology, Computer Science, Systems Engineering, or a related field.
  • Minimum 5 years of infrastructure or systems administration experience.
  • Experience supporting AI, machine learning, or high-performance computing environments.
  • Strong knowledge of cloud platforms and virtualization technologies.
  • Experience with Linux and Windows server administration.
  • Knowledge of networking, security, and storage technologies.
  • Experience supporting enterprise-scale systems.
  • Strong troubleshooting and performance optimization skills.
  • Experience implementing automation solutions.
  • Knowledge of containerization technologies.
  • Strong communication and stakeholder management abilities.
  • Experience working within complex technology environments.

Nice To Haves

  • Experience with Kubernetes and Docker.
  • Experience supporting GPU clusters.
  • Cloud certifications in AWS, Azure, or GCP.
  • Knowledge of MLOps and AI deployment practices.
  • Experience with Terraform or Infrastructure as Code.
  • Knowledge of AI governance frameworks.
  • Security certifications.
  • Experience leading infrastructure teams.

Responsibilities

  • Oversee AI and machine learning infrastructure.
  • Manage servers, cloud environments, and compute resources.
  • Ensure system performance, reliability, and uptime.
  • Support GPU and CPU resource allocation.
  • Design scalable environments for AI workloads.
  • Optimize infrastructure performance and efficiency.
  • Implement automation and operational improvements.
  • Support infrastructure modernization initiatives.
  • Maintain security controls and access management.
  • Implement infrastructure governance standards.
  • Ensure compliance with organizational requirements.
  • Support risk management initiatives.
  • Monitor system health and performance.
  • Troubleshoot infrastructure and application issues.
  • Conduct root cause analysis and remediation activities.
  • Maintain operational reporting and documentation.
  • Mentor junior administrators and support teams.
  • Evaluate emerging technologies and best practices.
  • Recommend improvements to infrastructure architecture.
  • Support AI platform growth and scalability.