Principal Software Development Engineer (Kubernetes, AWS)

Expedia Group
San Jose, CA
Hybrid

About The Position

Expedia Group brands power global travel for everyone, everywhere. We design cutting-edge tech to make travel smoother and more memorable, and we create groundbreaking solutions for our partners. Our diverse, vibrant, and welcoming community is essential in driving our success. Our Technology Team partners with teams across Expedia Group to create innovative products, services, and tools to deliver high-quality experiences for travelers, partners, and our employees. A singular technology platform powered by data and machine learning provides secure, differentiated, and personalized experiences that drive loyalty and traveler satisfaction.

We’re seeking a motivated Principal Software Development Engineer with a passion for technology, problem solving, and out-of-the-box thinking to be part of our Runtime Team. Our team is responsible for building a container platform with a suite of capabilities to enable our developers to rapidly deploy and scale containerized workloads.

Requirements

  • 8+ years of experience in infrastructure automation, configuration management or container orchestration.
  • Bachelor’s or Master’s degree in a related technical field, or equivalent professional experience.
  • Strong programming skills in one or more languages: Java, Go, Python or Ruby.
  • Experience in cloud computing with Amazon Web Services (AWS) and containerization with Docker and Kubernetes/EKS.

Nice To Haves

  • Experience with Stateless and Stateful workloads, Service Mesh or Service Discovery, Monitoring, Alerting and Logging.
  • Understanding of security development principles such as token management, encryption, and certificates.
  • Experience with Continuous Integration tools like Jenkins or similar.
  • Experience building self-service technology platform capabilities, particularly in the container compute, traffic management, or API management spaces.
  • Experience mentoring other engineers and establishing standards for operational excellence and code quality at a multi-project level.

Responsibilities

  • Play a key role in crafting the strategic technical goals for our group.
  • Lead the architecture, design, and build of a Kubernetes-based compute runtime platform that will be used by all engineering teams across Expedia.
  • Provide technical leadership for a dynamic and growing engineering organization.
  • Work alongside a talented group of product managers and other technical leaders to deliver best-in-class capabilities to our Expedia developer community, and as a result help shape the future of online travel.
  • Design and Implement Core Platform Components: Evolve our Kubernetes-based environment, focusing on areas like multi-tenancy, network policy, resource management, and service mesh integration (e.g., Istio, Linkerd).
  • Architect for Scale and Reliability: Lead the technical design for scaling our control plane and data plane to handle a 10x increase in services and traffic.
  • Define and implement SLOs for the platform itself.
  • Improve the Developer Control Plane: Design and build the next generation of our CI/CD pipelines and GitOps workflows.
  • Drive the strategy for our internal developer portal (e.g., Backstage) to unify tooling, documentation, and service lifecycle management.
  • Automate Infrastructure Lifecycle: Author and maintain production-grade Infrastructure as Code (IaC) using Terraform and/or Crossplane.
  • Eliminate manual toil by automating cluster provisioning, node lifecycle, and dependency upgrades.
  • Technical Leadership and Mentorship: Act as a force multiplier. Mentor senior engineers on the team, lead architecture review sessions, and author RFCs to build consensus on significant technical decisions. Your influence will extend beyond the team to application developers and SREs.
  • Production Debugging: Serve as the final escalation point for complex, cross-cutting production incidents that involve the underlying platform, from kernel-level issues to CNI bugs to distributed system failures.
  • Collaborate across product management, architecture, and engineering leads to deliver capabilities that enable our developer community to function at a high capacity.
  • Advocate for operational excellence (such as unit testing, establishing SLAs, programming for resiliency and scalability).
  • Take ownership of high-stress scenarios by remaining calm and applying critical thinking and data-driven decision-making.

Benefits

  • medical/dental/vision
  • paid time off
  • Employee Assistance Program
  • wellness & travel reimbursement
  • travel discounts
  • International Airlines Travel Agent (IATAN) membership