Senior, Software Engineer - AI Systems

Walmart
Bentonville, AR
Onsite

About The Position

We’re seeking a Software Engineer to design and build AI-first systems with a focus on agentic AI, high-performance data/compute frameworks, and scalable, production-grade services. You’ll work across model-driven features and platform layers: integrating LLMs/agents, orchestrating pipelines with Ray, accelerating data science workloads with RAPIDS, and delivering robust APIs and services that power high-impact AI applications at scale. The ideal candidate blends strong software engineering fundamentals with practical ML systems exposure and a passion for performance, reliability, and developer experience.

Requirements

  • Bachelor's/Master's in CS, Engineering, or equivalent industry experience.
  • 4+ years building production backend or platform services (preferably in AI/ML contexts).
  • Proficiency in languages: Python (primary), plus one of Go/Java/C++ for performance services.
  • Proficiency in distributed frameworks: Ray, Spark, or Dask.
  • Proficiency in accelerated compute: RAPIDS (cuDF/cuML/cuGraph) and GPU-aware programming concepts (streams, memory).
  • Proficiency in service frameworks and deployment: FastAPI/Flask (Python), Kubernetes (K8s), and containerization (Docker).
  • Strong foundations in data structures/algorithms, concurrency, networking, and systems design.
  • Option 1: Bachelor's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 3 years’ experience in software engineering or related area.
  • Option 2: 5 years’ experience in software engineering or related area.

Nice To Haves

  • Production experience with agent frameworks (e.g., LangGraph-style planners, tool-use patterns, retrieval and memory components).
  • Experience with vector databases (e.g., FAISS, Milvus, pgvector, Pinecone) and feature stores.
  • Familiarity with LLM and embedding services, prompt/tooling patterns, and evaluation harnesses.
  • Hands-on experience with Kubernetes, autoscaling (HPA/KEDA), and GPU scheduling/operators.
  • Experience with performance profiling tools: PyTorch Profiler, Nsight, line_profiler, Ray Dashboard.
  • Experience with vLLM, Triton Inference Server, ONNX Runtime, or TensorRT for high‑throughput inference.
  • Pragmatic problem solver with a bias for measurable outcomes (latency, throughput, reliability).
  • Excellent communicator able to translate between research goals and production constraints.
  • Drives clarity in ambiguous problem spaces; mentors others and uplifts engineering standards.
  • Background in creating inclusive digital experiences, with demonstrated knowledge of implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, working with assistive technologies, and integrating digital accessibility seamlessly.
  • Knowledge of accessibility best practices and an interest in helping create accessible products and services that follow Walmart’s accessibility standards and guidelines for supporting an inclusive culture.
  • Master’s degree in Computer Science, Computer Engineering, Computer Information Systems, Software Engineering, or related area and 1 year’s experience in software engineering or related area.

Responsibilities

  • Build agentic AI services (planning, tool use, retrieval, feedback loops) and integrate them with internal systems and APIs.
  • Implement orchestration, memory, tooling, evaluation, and guardrails for agentic workflows.
  • Collaborate with DS/MLE partners to productionize models (LLMs, GNNs, embedding services) behind stable APIs and SDKs.
  • Develop GPU‑accelerated pipelines using RAPIDS (cuDF/cuML/cuGraph) and optimize end‑to‑end performance.
  • Use Ray (or similar) for distributed compute, batch/stream processing, and scalable workflow orchestration.
  • Profile and optimize bottlenecks across CPU/GPU, memory, and I/O layers; implement caching, vectorization, and async patterns.
  • Design and maintain reliable microservices for training/inference, vector indexing, and real-time decisioning.
  • Implement observability (tracing/metrics/logging), fault tolerance, auto-scaling, and cost-aware execution.
  • Create internal SDKs/CLIs to streamline developer workflows, testing, and reproducibility.
  • Establish CI/CD for AI services (unit/integration/e2e tests, canaries, blue/green, rollback).
  • Integrate with feature stores, vector databases, artifact registries, and model catalogs.
  • Enforce security, privacy, and compliance (data minimization, PII handling, governance, auditability).
  • Partner with product, platform, and DS/MLE teams to align requirements, SLAs, and success metrics.
  • Document systems thoroughly; contribute to design reviews and engineering best practices.
  • Mentor peers on AI systems patterns, distributed compute, and performance engineering.

Benefits

  • Incentive awards for performance
  • 401(k) match
  • Stock purchase plan
  • Paid maternity and parental leave
  • PTO
  • Multiple health plans
  • Competitive pay
  • Performance-based bonus awards
  • Company-paid life insurance
  • Family care leave
  • Bereavement
  • Jury duty
  • Voting leave
  • Short-term and long-term disability
  • Company discounts
  • Military Leave Pay
  • Adoption and surrogacy expense reimbursement
  • PTO and/or PPTO that can be used for vacation, sick leave, holidays, or other purposes
  • Walmart-paid education benefit program (Live Better U) for full-time and part-time associates