About The Position

While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people and take pride in fostering a culture built on transparency, diversity, integrity, learning, and growth. If working in an environment that encourages you to innovate and excel, not just in your professional life but also in your personal life, interests you, you would enjoy your career with Quantiphi!

About Quantiphi:

Quantiphi is an award-winning, AI-First digital engineering and consulting company focused on delivering high-impact Services and Solutions that help organizations solve what truly matters. We partner with enterprises to reimagine their businesses through intelligent, scalable, and transformative AI, driving measurable outcomes at the very core of their operations.

Since our founding in 2013, Quantiphi has tackled some of the world’s most complex business challenges by combining deep industry expertise, disciplined cloud and data engineering practices, and cutting-edge applied AI research. Our work is rooted in delivering accelerated, quantifiable business value, not just technology for technology’s sake.

Headquartered in Boston, Quantiphi is a global organization with 4,000+ professionals serving clients across key industry verticals, including BFSI, Healthcare & Life Sciences, CPG, MFG, TME, etc. As an Elite and Premier partner to leading cloud and AI platforms such as NVIDIA, Google Cloud, AWS, and Snowflake, we build and deliver enterprise-grade AI services and solutions that create real-world impact.

We’ve been recognized with:

  • 17x Google Cloud Partner of the Year awards in the last 8 years
  • 3x AWS AI/ML award wins
  • 3x NVIDIA Partner of the Year titles
  • 2x Snowflake Partner of the Year awards

We have also garnered top analyst recognitions from Gartner, ISG, and Everest Group. We offer first-in-class industry solutions across Healthcare, Financial Services, Consumer Goods, Manufacturing, and more, powered by cutting-edge Generative AI and Agentic AI accelerators. We have been certified as a Great Place to Work for the third year in a row: 2021, 2022, and 2023.

Be part of a trailblazing team that’s shaping the future of AI, ML, and cloud innovation. Your next big opportunity starts here! For more details, visit: Website or LinkedIn Page.

Requirements

  • 6-8 years of hands-on experience in machine learning and AI engineering with a proven track record of taking ML systems to production
  • Demonstrated expertise in building multi-agent systems and agentic workflows, preferably with LangGraph/CrewAI
  • Expert-level Python proficiency with ML frameworks (TensorFlow, PyTorch, Transformers)
  • Experience with FastAPI, async programming, and microservices architecture
  • Hands-on experience with vector databases (Pinecone, Weaviate, ChromaDB) and building scalable RAG systems
  • Experience with LLM application monitoring tools (LangSmith, Weights & Biases, custom telemetry solutions)
  • Proven ability to architect and implement complex AI systems from scratch in production environments
  • Production-level experience with at least one major cloud platform (AWS, GCP, or Azure), including:
      • Compute services (EC2, GCE, Azure VMs)
      • Serverless functions (Lambda, Cloud Functions, Azure Functions)
      • Container orchestration (EKS, GKE, AKS)
      • Managed AI/ML services (SageMaker, Vertex AI, Azure ML)
  • Strong skills in Infrastructure as Code (Terraform, CloudFormation), CI/CD pipelines (GitHub Actions, Jenkins), and containerization (Docker, Kubernetes)
  • Exceptional problem-solving and analytical thinking with the ability to tackle complex, ambiguous challenges
  • Strong communication skills to explain complex agentic concepts to both technical and non-technical stakeholders
  • Proven ability to work independently and drive large-scale projects to completion with minimal supervision
  • Leadership mindset with experience mentoring team members and driving technical excellence

Nice To Haves

  • Experience with prompt engineering techniques, fine-tuning SLMs (PEFT, SFT, RLHF), and model optimization
  • Knowledge of distributed systems, message queues, and event-driven architectures for agent coordination
  • Familiarity with SDLC best practices, version control (Git), and agile development methodologies
  • Experience with tool-calling agents, multi-step workflows, and stateful orchestration (e.g. graphs, planners, routers).
  • Hands-on evals for agents: trajectory / tool-use checks, golden traces, LLM-as-judge with fixed rubrics, regression suites.
  • Online evals, drift thinking, and clear quality gates before or after deploy (thresholds, alerts, rollback criteria).
  • Safety and abuse: prompt injection via tools, untrusted retrieval, PII handling in prompts and logs, allowlists and guardrails.
  • Cost and latency discipline: budgets per run, timeouts, caps on turns and tool calls.
  • Model lifecycle: routing / gateway patterns, version pinning, fallbacks, and which model for which step.
  • Memory and state: what is persisted, retention, redaction, and what must never be stored

Responsibilities

  • Architect & Build Agentic Systems: Design and develop end-to-end multi-agent systems from scratch. You will create the foundational agent harnesses, define communication protocols, and build orchestration layers using frameworks like CrewAI, LangGraph, and AutoGen.
  • Make architectural decisions to ensure:
      • Hierarchical and collaborative multi-agent structures with well-defined agent roles, responsibilities, and communication protocols
      • Dynamic task decomposition, sophisticated tool integration, planning mechanisms (ReAct), and self-correction loops
  • Develop state management systems and memory mechanisms for persistent agent interactions
  • Engineer Advanced Agent Capabilities: Develop custom agent-tools and define specialized agent-skills that empower agents to perform complex, domain-specific tasks.
  • Pioneer Context Engineering: Implement advanced context engineering and memory systems to ensure agents maintain state, learn from interactions, and make informed decisions in dynamic environments.
  • Own the deployment, scaling, and maintenance of robust, low-latency agentic systems on major cloud platforms (GCP, AWS, or Azure).
  • Implement best-in-class MLOps practices for monitoring, continuous integration/continuous deployment (CI/CD), and system reliability.
  • Integrate LLMs to serve as the core reasoning engines for autonomous agents. You will apply advanced techniques like RAG and PEFT to optimize performance.
  • Create and maintain comprehensive tool libraries for agents including API integrations, database queries, and external service connections
  • Design and implement RAG systems using vector databases (Pinecone, Weaviate, ChromaDB)
  • Develop custom tools and plugins that enable agents to interact with various enterprise systems and APIs
  • Ensure tool reliability, error handling, and seamless integration within agentic workflows
  • Implement comprehensive monitoring and tracing systems for agent behavior, performance, cost optimization, and latency analysis
  • Design novel evaluation frameworks to assess multi-step agentic task success, reliability, and accuracy
  • Utilize advanced observability tools (LangSmith, Arize AI, or custom solutions) to trace agent decision-making processes
  • Establish metrics and KPIs for measuring agentic system performance in production environments

Benefits

  • Ample opportunities to learn, grow, and interact with colleagues of varied experience and backgrounds from around the globe.

What This Job Offers

Job Type

Full-time

Career Level

Senior

Education Level

No Education Listed

Number of Employees

501-1,000 employees
