Gen AI Developer Specialist

HEXAWARE, United States

About The Position

We are seeking an experienced Gen AI Developer Specialist with 8+ years of experience to design, build, and operate a secure, scalable, and cost-efficient enterprise Generative AI platform on AWS. This role will support production-grade LLM applications in regulated environments and involves owning the full GenAI platform lifecycle, including architecture, deployment, operations, monitoring, incident management, and continuous improvements. The specialist will implement and run AWS Bedrock-based solutions, enabling LLM inference, RAG, Agents, and Guardrails with high availability, fault tolerance, and SLA compliance. A key aspect of this role is establishing strong operational and governance frameworks covering observability, alerting, RCA, security controls, access management, compliance, and cost optimization. The ideal candidate will bring deep expertise in cloud ML platforms and financial services, with strong Python skills, AWS services knowledge, hands-on GenAI experience, and a background in production support, reliability engineering, and AI governance.

Requirements

  • 8+ years of hands-on experience with AI/ML or Generative AI systems
  • Strong understanding of AI evaluation techniques, including hallucination detection, factual accuracy, bias, and output consistency
  • Knowledge of Responsible AI principles, including fairness, transparency, and explainability
  • Python (must-have)
  • Experience with REST APIs and microservices

Nice To Haves

  • AWS services knowledge
  • Hands-on GenAI experience
  • Background in production support, reliability engineering, and AI governance
  • Deep expertise in cloud ML platforms and financial services

Responsibilities

  • Design, build, and operate a secure, scalable, and cost-efficient enterprise Generative AI platform on AWS, supporting production-grade LLM applications in regulated environments.
  • Own the full GenAI platform lifecycle, including architecture, deployment, operations, monitoring, incident management, and continuous reliability and performance improvements.
  • Implement and run AWS Bedrock–based solutions, enabling LLM inference, RAG, Agents, and Guardrails with high availability, fault tolerance, and SLA compliance.
  • Establish strong operational and governance frameworks, covering observability, alerting, RCA, security controls, access management, compliance, and cost optimization.