AI/ML Engineer - Agentic

Hewlett Packard Enterprise, San Jose, CA
Hybrid

About The Position

Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.

The AI/ML Engineer – Agentic is a senior individual contributor responsible for designing, building, and operating a production-grade agentic orchestration platform, including multi-agent workflows and MCP server–based tool infrastructure. The role focuses on enterprise-scale LLM integration, shared retrieval and memory services, and high-performance backend systems that power agent execution. This position owns reliability, observability, and cloud-native operations for non-deterministic agentic systems in production. Contributions include applying developed subject-matter expertise to solve common and sometimes complex technical problems and recommending alternatives where necessary. The engineer may act as a project lead and provide guidance to lower-level professionals, exercising independent judgment and consulting with others to determine the best method for accomplishing work and achieving objectives.

Requirements

  • Bachelor’s degree in computer science, engineering, information systems, or closely related quantitative discipline.
  • Typically 4-7 years of experience.
  • Production experience with agentic frameworks: LangGraph (preferred), Claude Agent SDK, or equivalent (not just prototypes)
  • Deep understanding of multi-agent architectures: supervisor/worker patterns, hierarchical agent graphs, ReAct loops, ReWOO
  • Hands-on with inter-agent communication protocols: MCP (Model Context Protocol), A2A, tool registry / server registry
  • LLM API integration at scale: structured outputs, streaming, function/tool calling, error handling
  • RAG pipeline design and optimization: chunking strategies, re-ranking, hybrid search; knowing which knobs to turn for which issues
  • Vector store experience: OpenSearch or equivalent
  • Applied ML intuition: fine-tuning concepts, prompt engineering, evaluations, QLoRA, PEFT
  • Backend development: FastAPI, gRPC, Kafka, Redis, message queues, async programming
  • System design: Python, API design
  • GraphQL and/or REST at enterprise scale
  • Observability and monitoring for non-deterministic systems: LangFuse, Prometheus, or equivalent
  • Kubernetes: deploying, scaling, and managing workloads (Deployments, Services, ConfigMaps, Secrets)
  • Container image management: building, tagging, versioning, and pushing images via Docker; familiarity with a container registry (ECR, GCR, Docker Hub)
  • CI/CD pipelines for automated build and deploy (GitHub Actions, Jenkins, ArgoCD, or similar)
  • Resource management: CPU/memory limits, autoscaling (HPA/VPA), health probes
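The LLM API integration bullet (structured outputs, tool calling, error handling) can be sketched in a few lines. This is a minimal, self-contained illustration using a stubbed model callable rather than any real LLM client; `call_with_retry`, `parse_structured`, and the key names are hypothetical, not an API the role prescribes:

```python
import json

class StructuredOutputError(Exception):
    """Raised when a model reply is not usable structured output."""

def parse_structured(raw, required_keys):
    """Validate that a model reply is JSON containing the expected keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise StructuredOutputError(f"not valid JSON: {exc}") from exc
    missing = [k for k in required_keys if k not in data]
    if missing:
        raise StructuredOutputError(f"missing keys: {missing}")
    return data

def call_with_retry(model_fn, prompt, required_keys, max_attempts=3):
    """Retry a (stubbed) model call until it yields valid structured output."""
    last_err = None
    for attempt in range(max_attempts):
        raw = model_fn(prompt, attempt)
        try:
            return parse_structured(raw, required_keys)
        except StructuredOutputError as err:
            last_err = err  # retry with the next attempt
    raise last_err
```

In production this validation layer typically sits in front of a real client's function/tool-calling response, with backoff and logging around the retry loop.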
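The hybrid-search bullet can likewise be illustrated with a toy sketch, assuming a tiny in-memory corpus: keyword overlap stands in for the lexical (BM25-style) leg and cosine similarity over hand-made vectors stands in for the vector leg, blended with a weight `alpha`. None of this reflects a specific OpenSearch configuration:

```python
import math

def keyword_score(query, doc):
    """Crude lexical score: fraction of query terms present in the doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, query_vec, corpus, alpha=0.5, top_k=2):
    """Rank (doc, vector) pairs by a weighted blend of both scores."""
    scored = []
    for doc, vec in corpus:
        score = alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, doc)
        scored.append((score, doc))
    scored.sort(key=lambda s: s[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]
```

Tuning `alpha`, the chunking granularity of `corpus`, and a re-ranking pass over the top-k results are exactly the "knobs" the requirement alludes to.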

Nice To Haves

  • Master’s degree desirable.
  • Multi-tenant architecture awareness: rate limiting, auth, tenant isolation
  • Knowledge base and cost optimization experience: AWS Bedrock, OpenSearch Serverless
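The rate-limiting and tenant-isolation item above can be pictured as a per-tenant token bucket. This is an in-memory sketch for illustration only; a real multi-tenant deployment would keep the buckets in shared state such as Redis, and the class and parameter names here are invented:

```python
import time

class TenantRateLimiter:
    """Token-bucket limiter keyed by tenant ID (in-memory sketch)."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.buckets = {}         # tenant -> (tokens, last_refill_time)

    def allow(self, tenant, now=None):
        """Return True and consume a token if the tenant is under its limit."""
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(tenant, (self.capacity, now))
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[tenant] = (tokens - 1.0, now)
            return True
        self.buckets[tenant] = (tokens, now)
        return False
```

Because each tenant gets its own bucket, one tenant exhausting its budget never blocks another, which is the isolation property the bullet is after.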

Responsibilities

  • Design, build, and own a production-grade agentic orchestration platform, implementing scalable multi-agent workflows using frameworks such as LangGraph or equivalent.
  • Architect, develop, and operate the MCP server infrastructure, including inter-agent communication, tool/server registries, domain isolation, versioning, and lifecycle management.
  • Integrate and operate LLM services at enterprise scale, supporting streaming, structured outputs, tool/function calling, and robust error handling across agent workflows.
  • Build and maintain retrieval and memory services for agentic systems, including RAG pipelines, OpenSearch-backed vector stores, hybrid search, and relevance optimization.
  • Develop and operate high-performance backend services (FastAPI, gRPC, async systems, messaging) that power orchestration, tool execution, and agent runtime behavior.
  • Own observability and reliability for non-deterministic systems, delivering end-to-end tracing, monitoring, and cost/performance visibility for agent executions.
  • Manage cloud-native infrastructure and deployment, including Kubernetes workloads, containerized services, CI/CD pipelines, and resource optimization (CPU/memory, autoscaling).
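The supervisor/worker orchestration described in the first responsibility can be reduced to a small loop: a routing function picks the next worker agent (or stops), and each worker's output becomes the next step's input. This is a framework-free sketch with stubbed agents, not LangGraph code; every function name here is illustrative:

```python
def research_agent(task):
    """Stub worker: pretends to gather material for a task."""
    return f"research notes on {task}"

def summarize_agent(task):
    """Stub worker: pretends to condense its input."""
    return f"summary of {task}"

WORKERS = {"research": research_agent, "summarize": summarize_agent}

def supervisor(goal, route):
    """Supervisor loop: `route` names the worker for the next step,
    or returns None when the goal is complete."""
    transcript = []
    step = goal
    while (worker_name := route(step, transcript)) is not None:
        result = WORKERS[worker_name](step)
        transcript.append((worker_name, result))
        step = result  # feed each worker's output to the next step
    return transcript
```

In a real platform the `route` decision is itself an LLM call, and the transcript becomes the shared state that graph frameworks persist, trace, and replay.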

Benefits

  • Health & Wellbeing: We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial, and emotional wellbeing.
  • Personal & Professional Development: We also invest in your career because the better you are, the better we all are. We have specific programs catered to helping you reach any career goals you have, whether you want to become a knowledge expert in your field or apply your skills to another division.
  • Unconditional Inclusion: We are unconditionally inclusive in the way we work and celebrate individual uniqueness.