AI Platform Engineer - TS/SCI with CI Poly

PGTEK
McLean, VA
$125,000 - $185,000
Onsite

About The Position

We are seeking an experienced AI Platform Engineer to play a critical role in building, maintaining, securing, and optimizing the infrastructure that supports advanced Artificial Intelligence (AI) workloads. This individual will be responsible for designing and managing scalable Kubernetes environments, implementing automated deployment pipelines, and ensuring platform reliability, security, and performance. The ideal candidate combines deep expertise in cloud-native technologies, Kubernetes administration, DevOps practices, and automation with strong problem-solving and collaboration skills. This role will work closely with engineering, operations, and security teams to deliver highly available AI platform solutions in a mission-critical environment.

Requirements

  • Extensive experience designing, deploying, and managing Kubernetes environments (EKS, AKS, GKE, OpenShift, or self-managed clusters).
  • Advanced knowledge of Docker and containerization technologies.
  • Strong understanding of Kubernetes networking, service meshes, and cluster architecture.
  • Expertise in Kubernetes security, access controls, and secret management.
  • Experience with CI/CD platforms such as Jenkins, GitLab CI/CD, GitHub Actions, Tekton, Argo Workflows, or similar.
  • Proficiency with Infrastructure as Code tools including Terraform, Pulumi, or CloudFormation.
  • Strong scripting and automation experience using Python, Go, Bash, or similar languages.
  • Experience with GitOps tools such as Argo CD.
  • Hands-on experience with monitoring and observability platforms including Prometheus, Grafana, ELK/OpenSearch, Datadog, or Splunk.
  • Strong Linux/Unix administration background.
  • Solid understanding of networking concepts including TCP/IP, DNS, HTTP, and load balancing.
  • Expert-level Git and version control experience.
  • Exceptional troubleshooting and analytical problem-solving abilities.
  • Strong verbal and written communication skills.
  • Ability to work effectively in cross-functional teams.
  • Experience mentoring engineers and leading technical efforts.
  • Strong sense of ownership and accountability.
  • Adaptability and commitment to continuous learning.
  • Active TS/SCI with Counterintelligence Polygraph clearance.
  • Current IAM Level II certification meeting DoD 8570 IAT requirements.

Nice To Haves

  • Certified Kubernetes Administrator (CKA)
  • Certified Kubernetes Application Developer (CKAD)
  • Certified Kubernetes Security Specialist (CKS)
  • AWS Certified DevOps Engineer
  • Azure DevOps Engineer Expert
  • Experience developing Kubernetes Operators and Custom Resource Definitions (CRDs)
  • Experience building Internal Developer Platforms (IDPs)
  • Familiarity with testing methodologies including unit, integration, and end-to-end testing

Responsibilities

  • Design, deploy, secure, maintain, and upgrade highly available Kubernetes clusters across cloud and on-premises environments.
  • Manage Kubernetes control plane components, worker nodes, and supporting infrastructure.
  • Implement and maintain containerized workloads using Docker and Kubernetes best practices.
  • Configure and manage Kubernetes resources including Pods, Deployments, StatefulSets, Services, Ingress, ConfigMaps, Secrets, Persistent Volumes, and Namespaces.
  • Support advanced networking configurations, including CNI plugins, network policies, service meshes, and DNS services.
  • Implement security best practices across Kubernetes environments.
  • Manage RBAC, admission controllers, vulnerability scanning, secret management, and network security controls.
  • Ensure platform compliance with government and organizational security requirements.
  • Support secure deployment practices and infrastructure hardening initiatives.
  • Design, implement, and maintain CI/CD pipelines for containerized applications.
  • Utilize GitOps methodologies and tools to automate application deployment and platform management.
  • Develop infrastructure as code (IaC) solutions using Terraform, Pulumi, CloudFormation, or similar tools.
  • Create automation scripts and tooling using Python, Go, Bash, or related languages.
  • Implement monitoring, logging, alerting, and observability solutions across platform environments.
  • Diagnose and resolve complex performance issues affecting Kubernetes clusters and applications.
  • Optimize resource utilization and platform scalability.
  • Support distributed tracing, centralized logging, and operational analytics initiatives.
  • Apply DevOps and Site Reliability Engineering (SRE) principles to improve platform resilience and operational excellence.
  • Collaborate with development, operations, security, and infrastructure teams.
  • Lead technical initiatives and mentor junior engineers.
  • Drive continuous improvement efforts across platform engineering and deployment practices.
  • Communicate effectively with technical and non-technical stakeholders.

Benefits

  • Comprehensive PPO medical coverage with access to a Health Savings Account (HSA) option
  • Vision plan
  • Dental insurance with the base dental plan option paid for by PGTEK
  • Life insurance
  • Short- and long-term disability
  • Critical illness insurance
  • Matching 401(k) plan
  • Discount on pet insurance through ASPCA Pet Insurance
  • Employee Assistance Program
  • Generous PTO and holidays
  • Education Assistance Program