DevOps Engineer (AI/ML)

Global Payment Holding Company, Alpharetta, GA

About The Position

Every day, Global Payments makes it possible for millions of people to move money between buyers and sellers using our payments solutions for credit, debit, prepaid and merchant services. Our worldwide team helps over 3 million companies, more than 1,300 financial institutions and over 600 million cardholders grow with confidence and achieve amazing results. We are driven by our passion for success, and we are proud to deliver best-in-class payment technology and software solutions. Join our dynamic team and make your mark on the payments technology landscape of tomorrow.

At this time, we are unable to offer visa sponsorship for this position. Candidates must be legally authorized to work in the United States (or applicable country) on a full-time basis without the need for current or future immigration sponsorship. Please note, we are not accepting candidates on H1B or OPT status.

Overview

We are looking for an experienced DevOps Engineer to support our AI and ML initiatives, including GenAI platform development, deployment automation, and infrastructure optimization. You will play a critical role in building and maintaining scalable, secure, and observable systems that power RAG solutions, model training platforms, and agentic AI workflows across the enterprise.

Requirements

  • 6+ years of DevOps or infrastructure engineering experience, preferably with 2+ years in AI/ML environments.
  • Hands-on experience with cloud-native services (AWS Bedrock/SageMaker, GCP Vertex AI, or Azure ML) and GPU infrastructure management.
  • Strong skills in CI/CD tools (GitHub Actions, ArgoCD, Jenkins) and configuration management (Ansible, Helm, etc.).
  • Proficient in scripting languages such as Python and Bash; Go or similar is a nice plus.
  • Experience with monitoring, logging, and alerting systems for AI/ML workloads.
  • Deep understanding of Kubernetes and container lifecycle management.

Nice To Haves

  • Exposure to MLOps tooling such as MLflow, Kubeflow, SageMaker Pipelines, or Vertex Pipelines.
  • Familiarity with prompt engineering, model fine-tuning, and inference serving.
  • Experience with secure AI deployment and compliance frameworks.
  • Knowledge of model versioning, drift detection, and scalable rollback strategies.

Responsibilities

  • Design and implement CI/CD pipelines for AI and ML model training, evaluation, and RAG system deployment (including LLMs, vector databases, embedding and reranking models, governance and observability systems, and guardrails).
  • Provision and manage AI infrastructure across cloud hyperscalers (AWS/GCP) using infrastructure-as-code tools (strong preference for Terraform).
  • Maintain containerized environments (Docker, Kubernetes) optimized for GPU workloads and distributed compute.
  • Support vector database, feature store, and embedding store deployments (e.g., pgVector, Pinecone, Redis, Featureform, MongoDB Atlas, etc.).
  • Monitor and optimize performance, availability, and cost of AI workloads, using observability tools (e.g., Prometheus, Grafana, Datadog, or managed cloud offerings).
  • Collaborate with data scientists, AI/ML engineers, and other members of the platform team to ensure smooth transitions from experimentation to production.
  • Implement security best practices including secrets management, model access control, data encryption, and audit logging for AI pipelines.
  • Help support the deployment and orchestration of agentic AI systems (LangChain, LangGraph, CrewAI, Copilot Studio, AgentSpace, etc.).

What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed
Number of Employees: 5,001-10,000 employees
