AI Support Engineer (II+)

Global Payment Holding Company | Alpharetta, GA

About The Position

Every day, Global Payments makes it possible for millions of people to move money between buyers and sellers using our payments solutions for credit, debit, prepaid and merchant services. Our worldwide team helps over 3 million companies, more than 1,300 financial institutions and over 600 million cardholders grow with confidence and achieve amazing results. We are driven by our passion for success and we are proud to deliver best-in-class payment technology and software solutions. Join our dynamic team and make your mark on the payments technology landscape of tomorrow.

At this time, we are unable to offer visa sponsorship for this position. Candidates must be legally authorized to work for any employer in the United States (or the applicable country) on a full-time basis without the need for current or future immigration sponsorship.

Requirements

  • 4+ years of experience in production support, software engineering, site reliability engineering (SRE), or DevOps—preferably supporting GenAI and/or ML systems.
  • Strong understanding of cloud infrastructure (AWS, GCP) and AI observability tools (e.g., Fiddler AI, Arize AI, IBM watsonx.governance).
  • Experience with LLM and GenAI systems (OpenAI, Azure OpenAI, Bedrock, Vertex AI, or similar).
  • Familiarity with modern orchestration and agentic frameworks such as LangChain, LangGraph, AutoGen, or CrewAI.
  • Proficiency in Python or shell scripting for automation and troubleshooting.
  • Strong analytical, communication, and incident management skills.
  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 1+ years of experience in AI/ML engineering, with a focus on Generative AI.
  • Proficiency in programming languages such as Python.
  • Strong understanding of Generative AI models (e.g., GPT, Transformer architectures) and experience in distilling, tuning and training them.
  • Familiarity with Retrieval Augmented Generation (RAG) techniques and their implementation.
  • Experience with agentic AI concepts and developing autonomous AI workflows.
  • Hands-on experience with GCP Vertex AI, AWS Bedrock and SageMaker, and Snowflake Cortex platforms and their AI/ML capabilities.
  • Experience building production-grade AI/ML systems at scale.
  • Knowledge of MLOps practices, including model deployment and lifecycle management.
  • Excellent problem-solving and analytical skills.
  • Excellent communication and collaboration skills.
  • Availability for on-call rotation and support.

Nice To Haves

  • Familiarity with Prompt Engineering, RLHF, and model evaluation techniques.
  • Understanding of AI governance, safety, and responsible AI principles.
  • Understanding of reinforcement learning and its application in agentic AI.
  • Familiarity with big data technologies (Apache Spark, Kafka).
  • Experience with CI/CD tools and automation for AI/ML workflows.
  • Experience with real-time data processing and streaming analytics.

Responsibilities

  • Serve as the first line of defense for production AI incidents, ensuring rapid triage, root cause analysis, and resolution.
  • Monitor system health and performance of deployed AI applications, agentic and RAG-based solutions, MCPs, and orchestration platforms.
  • Track and investigate issues related to latency, failures, model drift, hallucination, prompt misbehavior, or broken integrations, escalating to the AI engineering group where appropriate.
  • Collaborate with AI and platform engineers to implement observability, logging, and alerting best practices for all AI services.
  • Build diagnostic tools, runbooks, and automated workflows to improve incident response time and reduce manual intervention.
  • Maintain knowledge bases and playbooks for repeatable troubleshooting and knowledge transfer.
  • Partner with governance and compliance teams to ensure incidents are documented and remediated in line with internal policy.
  • Contribute to postmortems and continuous improvement efforts to harden production systems.