Principal Software Engineer - AI Systems

Wal-Mart
Sunnyvale, CA
40d
$143,000 - $286,000

About The Position

What you'll do

1. AI Systems Engineering

  • Design and implement large-scale, production-grade AI systems that integrate LLMs and Generative AI into real-world applications.
  • Build frameworks that support Retrieval-Augmented Generation (RAG), agentic workflows, and multi-step reasoning at scale.
  • Ensure models and agents are production-ready with strong observability, monitoring, and performance optimization.

2. Architecture & Scalability

  • Architect distributed, fault-tolerant systems capable of supporting high-throughput AI workloads.
  • Lead the design of modular, extensible, and reusable components to accelerate AI adoption across teams.
  • Build MVPs quickly, validate assumptions, and iterate toward scalable long-term solutions.

3. Integration & Delivery

  • Partner with product and platform teams to integrate AI into customer-facing and enterprise-grade applications.
  • Define and enforce standards for APIs, services, and infrastructure that enable seamless AI adoption.
  • Balance functional requirements with non-functional goals such as reliability, latency, and security.

4. Leadership & Mentorship

  • Drive technical strategy for AI initiatives and guide teams in best practices for AI-driven software development.
  • Mentor engineers across software and AI domains to elevate overall technical expertise.
  • Contribute to thought leadership in AI engineering through internal frameworks, design patterns, and reusable components.

What you'll bring

  • 12+ years of experience in software engineering (backend, distributed systems, large-scale platforms), with 2+ years applying Generative AI/LLMs in production.
  • Proven expertise in distributed computing, cloud-native architectures (GCP, Azure, or AWS), and systems that prioritize scalability and fault tolerance.
  • Strong coding skills in Python (preferred) and at least one system-level language (Java, Go, or C++).
  • Experience with ML/AI frameworks (PyTorch, TensorFlow, Hugging Face) is a plus, but applied in the context of building systems, not just training models.
  • Deep knowledge of RAG pipelines, vector databases, and real-time data integration.
  • Familiarity with resilience engineering: disaster recovery, failover, monitoring, and high availability.
  • Exposure to multi-modal AI (text, image, video) and optimization techniques (quantization, distillation) is advantageous.
  • Strong grounding in system design, performance engineering, and design patterns.
  • Track record of delivering production systems with AI at scale, not just research or prototyping.

Requirements

  • 12+ years of experience in software engineering (backend, distributed systems, large-scale platforms), with 2+ years applying Generative AI/LLMs in production.
  • Proven expertise in distributed computing, cloud-native architectures (GCP, Azure, or AWS), and systems that prioritize scalability and fault tolerance.
  • Strong coding skills in Python (preferred) and at least one system-level language (Java, Go, or C++).
  • Deep knowledge of RAG pipelines, vector databases, and real-time data integration.
  • Familiarity with resilience engineering: disaster recovery, failover, monitoring, and high availability.
  • Strong grounding in system design, performance engineering, and design patterns.
  • Track record of delivering production systems with AI at scale, not just research or prototyping.
  • Option 1: Bachelor's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 5 years' experience in software engineering or related area.
  • Option 2: 7 years' experience in software engineering or related area.

Nice To Haves

  • Experience with ML/AI frameworks (PyTorch, TensorFlow, Hugging Face) is a plus, but applied in the context of building systems, not just training models.
  • Exposure to multi-modal AI (text, image, video) and optimization techniques (quantization, distillation) is advantageous.
  • Master's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 3 years' experience in software engineering or related area.
  • We value candidates with a background in creating inclusive digital experiences, demonstrating knowledge in implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, assistive technologies, and integrating digital accessibility seamlessly.
  • The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart's accessibility standards and guidelines for supporting an inclusive culture.

Responsibilities

  • AI Systems Engineering: Design and implement large-scale, production-grade AI systems that integrate LLMs and Generative AI into real-world applications.
  • Build frameworks that support Retrieval-Augmented Generation (RAG), agentic workflows, and multi-step reasoning at scale.
  • Ensure models and agents are production-ready with strong observability, monitoring, and performance optimization.
  • Architecture & Scalability: Architect distributed, fault-tolerant systems capable of supporting high-throughput AI workloads.
  • Lead the design of modular, extensible, and reusable components to accelerate AI adoption across teams.
  • Build MVPs quickly, validate assumptions, and iterate toward scalable long-term solutions.
  • Integration & Delivery: Partner with product and platform teams to integrate AI into customer-facing and enterprise-grade applications.
  • Define and enforce standards for APIs, services, and infrastructure that enable seamless AI adoption.
  • Balance functional requirements with non-functional goals such as reliability, latency, and security.
  • Leadership & Mentorship: Drive technical strategy for AI initiatives and guide teams in best practices for AI-driven software development.
  • Mentor engineers across software and AI domains to elevate overall technical expertise.
  • Contribute to thought leadership in AI engineering through internal frameworks, design patterns, and reusable components.

Benefits

  • At Walmart, we offer competitive pay as well as performance-based bonus awards and other great benefits for a happier mind, body, and wallet. Health benefits include medical, vision and dental coverage. Financial benefits include 401(k), stock purchase and company-paid life insurance. Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting. Other benefits include short-term and long-term disability, company discounts, Military Leave Pay, adoption and surrogacy expense reimbursement, and more.
  • You will also receive PTO and/or PPTO that can be used for vacation, sick leave, holidays, or other purposes. The amount you receive depends on your job classification and length of employment. It will meet or exceed the requirements of paid sick leave laws, where applicable.
  • Live Better U is a Walmart-paid education benefit program for full-time and part-time associates in Walmart and Sam's Club facilities. Programs range from high school completion to bachelor's degrees, including English Language Learning and short-form certificates. Tuition, books, and fees are completely paid for by Walmart.
  • The annual salary range for this position is $143,000.00-$286,000.00.
  • Additional compensation includes annual or quarterly performance bonuses.
  • Stock

What This Job Offers

Job Type

Full-time

Career Level

Principal

Industry

General Merchandise Retailers

Number of Employees

5,001-10,000 employees

© 2024 Teal Labs, Inc