About The Position

The isolved Senior Software Engineer, AI Platform role owns both program execution and technical direction, leading ~20 engineers across domain teams (Tax, Benefits, Time, Payroll, Shared Logic), alongside two Engineering Managers and a Data Architect. The position blends delivery leadership with deep technical involvement, serving as a key decision-maker, escalation point for complex challenges, and ultimate owner of program outcomes.

Requirements

  • 5+ years of professional software engineering experience, with Python as your primary language
  • 2+ years building production LLM-powered systems - inference, RAG, agentic patterns, or AI infrastructure
  • Deep Python expertise - this is the primary language for AI platform work
  • Working proficiency in C#/.NET - the platform serves teams that live in C#, so interop is real and matters
  • Strong hands-on experience with agentic frameworks - Semantic Kernel, LangGraph, LangChain, or you've built your own
  • Production experience with RAG architecture: chunking strategies, embedding models, vector search, retrieval quality, and the failure modes that don’t show up in demos
  • Azure AI Foundry / Azure OpenAI experience - model deployment, API integration, observability tooling
  • Experience building internal platforms or SDKs that other engineers depend on - you understand what makes a platform feel good to use
  • Strong grasp of AI observability: token usage, latency, cost tracking, and distributed tracing across multi-agent workflows

Nice To Haves

  • Experience with TypeScript and building developer SDKs or tooling
  • Hands-on experience with AI evaluation frameworks (LLM-as-judge, automated regression testing)
  • Knowledge of AI governance practices, including access control, audit logging, and security safeguards
  • Familiarity with container-based deployments (e.g., Azure Container Apps) and infrastructure-as-code (Terraform)
  • Awareness of AI regulatory frameworks such as NIST AI RMF or ISO/IEC 42001

Responsibilities

  • Design and build a scalable LLM gateway with model routing, prompt management, cost attribution, rate limiting, and caching
  • Develop and operate RAG pipelines, embedding services, and vector search infrastructure for platform-wide use
  • Implement platform-level cost optimization strategies, including semantic caching and model selection by workload
  • Build and maintain agentic runtime infrastructure, including orchestration, state management, and human-in-the-loop patterns
  • Develop extensible MCP server and tool ecosystems for product team integration
  • Design and support multi-agent coordination patterns using modern frameworks and protocols
  • Establish comprehensive AI observability, including usage, latency, cost tracking, and distributed tracing
  • Implement AI governance controls, including access management, audit logging, content filtering, and security protections
  • Build AI incident detection and response capabilities, including monitoring for failures, hallucinations, and cost anomalies
  • Create developer-friendly SDKs across languages (Python, .NET, TypeScript) to simplify platform adoption
  • Define "paved road" patterns for common AI use cases and support onboarding of product teams
  • Build automated evaluation pipelines and continuously monitor production quality and model performance