About The Position

We’re hiring a Senior Applied AI Engineer to build production AI systems that real customers depend on. This role is for an experienced software engineer who also understands modern AI systems. You should be comfortable building with LLMs, agents, retrieval pipelines, and workflow orchestration, but just as comfortable thinking about system design, reliability, testing, deployment, debugging, and long-term maintainability. You’ll work on everything from agent workflows and retrieval systems to backend APIs, evaluation tooling, observability, and production infrastructure.

We care a lot about engineering quality. That means building systems that are understandable, testable, observable, and reliable in production. We are looking for someone who can help raise the engineering bar around AI development and bring strong technical judgment to a fast-moving environment.

Requirements

  • US citizen or authorized to work in the US
  • 5+ years of professional software engineering experience building production systems
  • Strong proficiency in Python
  • Strong backend engineering fundamentals and experience building scalable APIs, services, distributed systems, or workflow orchestration platforms
  • Proven hands-on experience building and shipping AI-powered applications using LLMs, generative AI APIs, agents, retrieval systems, or related technologies in production environments
  • Experience designing and implementing agentic workflows, tool-calling systems, structured outputs, prompt pipelines, or retrieval-augmented generation architectures
  • Strong understanding of the practical challenges involved in production AI systems, including hallucination mitigation, evaluation, reliability, observability, latency, and cost management
  • Experience building production software systems with strong engineering standards around testing, QA, deployment, monitoring, and maintainability
  • Strong understanding of modern software engineering practices, including Git workflows, code review, CI/CD, automated testing, operational debugging, and release management
  • Experience working with cloud infrastructure, preferably AWS
  • Experience working with SQL and/or NoSQL databases
  • Strong debugging, systems-thinking, and problem-solving skills
  • Ability to operate effectively in fast-moving environments with evolving requirements and imperfect information
  • Strong communication skills and ability to collaborate across technical and non-technical teams

Nice To Haves

  • Experience with Amazon Bedrock, AWS Lambda, Step Functions, S3, DynamoDB, RDS, SQS, EventBridge, or related AWS services
  • Experience with LangGraph, LangChain, DSPy, Semantic Kernel, or similar orchestration frameworks
  • Experience building multi-step agents that interact with tools, APIs, external systems, or business workflows
  • Experience implementing AI evaluation systems, prompt regression testing, trace analysis, or human-in-the-loop review workflows
  • Experience with vector databases and semantic retrieval systems such as OpenSearch, pgvector, Pinecone, Weaviate, FAISS, or similar technologies
  • Experience with observability and LLMOps tooling such as LangSmith, Arize, Helicone, Weights & Biases, OpenTelemetry, or similar platforms
  • Experience balancing quality, latency, reliability, and cost tradeoffs in production AI systems
  • Experience mentoring engineers and helping establish strong engineering culture and development practices
  • Experience working in startup or high-ownership product environments
  • Ability to think critically about edge cases, failure modes, operational risk, and long-term maintainability

Responsibilities

  • Design, build, and maintain production-grade AI systems and customer-facing AI features
  • Develop agentic workflows using LLMs, retrieval systems, tools, APIs, and backend services
  • Build backend services, orchestration systems, automation, and infrastructure supporting AI-powered workflows
  • Design and implement retrieval-augmented generation (RAG) systems, including ingestion pipelines, embeddings, semantic retrieval, and context assembly
  • Integrate foundation models through platforms such as Amazon Bedrock or Amazon Bedrock AgentCore
  • Develop robust prompting strategies, structured outputs, guardrails, and workflow logic for production use cases
  • Implement evaluation systems for prompts, agents, and workflows, including regression testing, trace review, golden datasets, and human QA processes
  • Monitor and improve production AI systems for quality, reliability, latency, observability, and cost efficiency
  • Debug AI behavior through logs, traces, evaluations, user feedback, and production telemetry
  • Collaborate closely with engineering, product, operations, and customer-facing teams to turn ambiguous requirements into reliable systems
  • Help establish strong engineering standards around testing, deployment, CI/CD, version control workflows, code review, and operational reliability
  • Mentor and collaborate with engineers across both software and AI disciplines
  • Evaluate emerging AI technologies pragmatically based on business impact, maintainability, and operational reliability