MLOps Platform Engineer

CGI | Reston, VA
8h | $107,700 - $154,300 | Hybrid

About The Position

CGI has an immediate need for an MLOps Platform Engineer to join our team. This is an exciting opportunity to work in a fast-paced team environment supporting one of our largest customers. We take an innovative approach to supporting our client, working side by side in an agile environment using emerging technologies. We partner with 15 of the top 20 banks globally, and our top 10 banking clients have worked with us for an average of 26 years. This role is located at a client site in Reston, VA. A hybrid working model is acceptable.

The Data Modeling, Analytics & AI Engineering team is seeking a hands-on MLOps Platform Engineer to design, build, and operate enterprise-grade machine learning platforms. This role focuses on enabling scalable, secure, and reliable ML model development and deployment across AWS cloud environments and Kubernetes (EKS) clusters. You will play a key role in engineering and supporting infrastructure for ML training, batch inference, and real-time model serving. The position requires strong platform engineering fundamentals, CI/CD automation expertise, and experience operating containerized workloads in production environments.

You will work closely with Data Scientists, ML Engineers, and application teams to operationalize end-to-end ML solutions while ensuring performance, governance, and cost efficiency in a regulated enterprise environment.

Requirements

  • 7+ years of experience with AWS services including EKS, EC2, S3, IAM, CloudWatch, and ECR
  • Strong operational knowledge of Kubernetes, preferably AWS EKS
  • Experience designing and managing containerized workloads (Docker)
  • Proficiency in Python and Bash scripting
  • Experience building and maintaining CI/CD pipelines (GitLab or equivalent)
  • Familiarity with ML workflows including training, inference, and model monitoring
  • Experience with Infrastructure as Code (Terraform or CloudFormation)
  • Experience supporting production platforms, including incident response and root cause analysis
  • Strong understanding of RBAC, network policies, and multi-tenant Kubernetes designs
  • Knowledge of monitoring, logging, observability, and performance tuning practices
  • Experience with ML platforms such as Domino or Amazon SageMaker
  • Familiarity with MLflow or similar ML lifecycle tools
  • Experience supporting GPU-based workloads or distributed training
  • Understanding of enterprise MLOps architecture patterns (batch, real-time, microservices)
  • Exposure to data processing frameworks and feature pipelines

Benefits

  • Competitive compensation
  • Comprehensive insurance options
  • Matching contributions through the 401(k) plan and the share purchase plan
  • Paid time off for vacation, holidays, and sick time
  • Paid parental leave
  • Learning opportunities and tuition assistance
  • Wellness and well-being programs