AI Platform Runtime Engineer

NAVEX
Charlotte, NC
Hybrid

About The Position

You will join our Artificial Intelligence and Machine Learning team, a group that shares a passion for designing quality solutions, embracing new technologies, and delivering powerful products within our integrated risk and compliance management platform — products that help our customers protect their reputation and bottom line. We are changing the way people experience life at work!

As an AI Platform Runtime Engineer, you will build, test, deploy, and operate NAVEX's agentic systems in production: multi-step workflows that combine LLM reasoning with retrieval and tool execution. You will do this while maintaining reliability, cost control, security, and measurable quality over time. You will own the core runtime infrastructure that powers the NAVEX AI Product System, including intelligence primitives, cloud integration, release hardening, and CI/CD pipelines that keep AI experiences resilient, deterministic, and production-ready.

If you want to build the runtime backbone of a governed, enterprise-grade agentic AI platform, this role is for you. You'll thrive in this hybrid role, surrounded by an engaged, collaborative team deeply committed to your success. Join us and help shape what's next!

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field
  • 5+ years in platform engineering, infrastructure engineering, or backend systems development, with hands-on experience building production backend services, APIs, and distributed systems and a bias toward reliability and operability
  • Demonstrated agentic build experience (work or serious personal projects): experience building or contributing to agentic or LLM-based systems, including prototypes moved to production
  • Strong experience with AWS services, particularly Bedrock, Lambda, Step Functions, IAM, and related managed services
  • Experience building and operating multi-tenant SaaS platforms at production scale
  • Proficiency in Python and/or TypeScript/Node.js for backend service development
  • Experience with CI/CD pipeline design and infrastructure-as-code tools (Terraform, CDK, or CloudFormation)
  • Knowledge of containerization and orchestration (Docker, ECS, or Kubernetes)
  • Understanding of AI/ML runtime requirements including model serving, context management, and memory systems
  • Knowledge of non-deterministic system iteration loops—comfort working in an iterative “build, test, ship, observe, refine” cycle where agent behavior must be validated with systematic evaluation
  • Experience with observability tooling, cost monitoring, and operational runbook development
  • Culture agility: comfort working in a fast-paced, candid environment that values innovation, healthy debate, and follow-through
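The "build, test, ship, observe, refine" cycle mentioned above hinges on systematic evaluation: because agent output is non-deterministic, each evaluation case is run several times and a release only proceeds when the measured pass rate clears a threshold. A minimal sketch of that gating loop follows; the function names, check format, and 95% threshold are illustrative assumptions, not NAVEX's actual tooling.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalResult:
    passed: int
    total: int

    @property
    def pass_rate(self) -> float:
        return self.passed / self.total if self.total else 0.0

def evaluate_agent(
    agent: Callable[[str], str],
    cases: list[tuple[str, Callable[[str], bool]]],
    runs_per_case: int = 3,
) -> EvalResult:
    """Run each case several times (the agent is non-deterministic)
    and count the runs that satisfy the case's check function."""
    passed = total = 0
    for prompt, check in cases:
        for _ in range(runs_per_case):
            total += 1
            if check(agent(prompt)):
                passed += 1
    return EvalResult(passed, total)

def release_gate(result: EvalResult, threshold: float = 0.95) -> bool:
    """Block the 'ship' step when the pass rate falls below threshold."""
    return result.pass_rate >= threshold
```

In practice the `agent` callable would wrap a real LLM invocation and the checks would range from exact-match assertions to model-graded rubrics; the gate itself stays this simple.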

Responsibilities

  • Implement agent orchestration and multi-agent workflows—develop orchestration logic, including agent-to-agent communication patterns where needed
  • Build retrieval and grounding components—implement RAG pipelines, continuously evaluating and iterating to improve quality
  • Build and maintain core AI platform primitives (orchestration, context, memory boundaries, configuration layers)
  • Integrate and optimize AWS Bedrock as the managed execution substrate
  • Design and implement deterministic AI release bundles with versioned prompt and orchestration artifacts; build rollback mechanisms, tenant-level feature flags, and controlled rollout infrastructure
  • Implement tenant isolation, session boundaries, and memory scoping across AI runtime
  • Build and maintain CI/CD pipelines for AI artifact deployment and validation
  • Instrument agent observability and runtime monitoring—establish end-to-end tracing for agent runs, failure analysis, and latency and cost monitoring
  • Operationalize safe deployment practices
  • Implement security controls for tool-using agents—apply strict output validation and secure tool integration practices to mitigate common LLM risks and protect user and customer data
  • Build dashboards, alerts, and cost guardrails for monitoring AI runtime health
  • Collaborate with AI Architect and evaluation teams to ensure platform primitives support governance and quality gating requirements

Benefits

  • Clear, competitive compensation designed to recognize measurable outcomes and real impact
  • Opportunities for growth, leadership, and meaningful ownership
  • Support, challenge, and recognition for the impact you make