Senior Applied AI Engineer

Givzey

About The Position

We’re hiring a Senior Applied AI Engineer to build production AI systems that real customers depend on. This role is for an experienced software engineer who also understands modern AI systems. You should be comfortable building with LLMs, agents, retrieval pipelines, and workflow orchestration, but just as comfortable thinking about system design, reliability, testing, deployment, debugging, and long-term maintainability.

You’ll work on everything from agent workflows and retrieval systems to backend APIs, evaluation tooling, observability, and production infrastructure.

We care a lot about engineering quality. That means building systems that are understandable, testable, observable, and reliable in production. We are looking for someone who can help raise the engineering bar around AI development and bring strong technical judgment to a fast-moving environment.

Requirements

US citizen or authorized to work in the US
5+ years of professional software engineering experience building production systems
Strong proficiency in Python
Strong backend engineering fundamentals and experience building scalable APIs, services, distributed systems, or workflow orchestration platforms
Proven hands-on experience building and shipping AI-powered applications using LLMs, generative AI APIs, agents, retrieval systems, or related technologies in production environments
Experience designing and implementing agentic workflows, tool-calling systems, structured outputs, prompt pipelines, or retrieval-augmented generation architectures
Strong understanding of the practical challenges involved in production AI systems, including hallucination mitigation, evaluation, reliability, observability, latency, and cost management
Experience building production software systems with strong engineering standards around testing, QA, deployment, monitoring, and maintainability
Strong understanding of modern software engineering practices, including Git workflows, code review, CI/CD, automated testing, operational debugging, and release management
Experience working with cloud infrastructure, preferably AWS
Experience working with SQL and/or NoSQL databases
Strong debugging, systems-thinking, and problem-solving skills
Ability to operate effectively in fast-moving environments with evolving requirements and imperfect information
Strong communication skills and ability to collaborate across technical and non-technical teams

Nice To Haves

Experience with Amazon Bedrock, AWS Lambda, Step Functions, S3, DynamoDB, RDS, SQS, EventBridge, or related AWS services
Experience with LangGraph, LangChain, DSPy, Semantic Kernel, or similar orchestration frameworks
Experience building multi-step agents that interact with tools, APIs, external systems, or business workflows
Experience implementing AI evaluation systems, prompt regression testing, trace analysis, or human-in-the-loop review workflows
Experience with vector databases and semantic retrieval systems such as OpenSearch, pgvector, Pinecone, Weaviate, FAISS, or similar technologies
Experience with observability and LLMOps tooling such as LangSmith, Arize, Helicone, Weights & Biases, OpenTelemetry, or similar platforms
Experience balancing quality, latency, reliability, and cost tradeoffs in production AI systems
Experience mentoring engineers and helping establish strong engineering culture and development practices
Experience working in startup or high-ownership product environments
Ability to think critically about edge cases, failure modes, operational risk, and long-term maintainability

Responsibilities

Design, build, and maintain production-grade AI systems and customer-facing AI features
Develop agentic workflows using LLMs, retrieval systems, tools, APIs, and backend services
Build backend services, orchestration systems, automation, and infrastructure supporting AI-powered workflows
Design and implement retrieval-augmented generation (RAG) systems, including ingestion pipelines, embeddings, semantic retrieval, and context assembly
Integrate foundation models through platforms such as Amazon Bedrock or Amazon Bedrock AgentCore
Develop robust prompting strategies, structured outputs, guardrails, and workflow logic for production use cases
Implement evaluation systems for prompts, agents, and workflows, including regression testing, trace review, golden datasets, and human QA processes
Monitor and improve production AI systems for quality, reliability, latency, observability, and cost efficiency
Debug AI behavior through logs, traces, evaluations, user feedback, and production telemetry
Collaborate closely with engineering, product, operations, and customer-facing teams to turn ambiguous requirements into reliable systems
Help establish strong engineering standards around testing, deployment, CI/CD, version control workflows, code review, and operational reliability
Mentor and collaborate with engineers across both software and AI disciplines
Evaluate emerging AI technologies pragmatically based on business impact, maintainability, and operational reliability