Senior Cloud Performance Engineer, Infrastructure

Uber
Sunnyvale, CA
33d
$180,000 - $200,000

About The Position

We are seeking a highly skilled and motivated Senior Cloud Performance Engineer to join Uber's Fleet Engineering organization. In this role, you will lead the design, build, and operation of our next-generation cloud platform qualification and performance engineering efforts. You'll drive end-to-end performance evaluations, build automation and tooling that accelerates decisions, and partner with infra, SRE, internal service and product teams, and cloud/xPU vendors to deliver measurable improvements at Uber scale.

You'll work across multi-cloud, containerized, virtualized, and bare-metal systems to execute qualifications and in-depth performance analyses, tuning and improving observability across the stack, from hardware, the Linux kernel, and runtimes to distributed application services. Beyond running benchmarks, you'll dive deep to debug performance anomalies, extend existing benchmarks, and develop new workloads that better capture Uber's production patterns.

This role offers close collaboration with engineers across the stack, cloud partners, silicon vendors, and open-source communities to shape the next generation of Uber's infrastructure. It's ideal for someone with a hardware-software co-design background who thrives on uncovering performance insights and building systems that make those insights actionable.

Requirements

  • 5+ years of software engineering or systems/performance engineering experience (BS in CS/EE or a related field), with demonstrated end-to-end ownership of complex projects.
  • Proficient in two or more of Go, Python, Java, and C/C++; strong CS fundamentals and testing/automation discipline.
  • Hands-on with Linux internals (CPU scheduling, memory, I/O, networking) and perf tooling (perf, eBPF, flamegraphs, tracing frameworks).
  • Experience with Docker/Kubernetes, microservices, and distributed systems; comfort building production services and pipelines.
  • Proven track record of clear communication, writing design docs/postmortems, and leading cross-functional efforts.

Nice To Haves

  • Experience tuning databases, stream processing, batch, or ML platforms (e.g., PyTorch, JAX); GPU/xPU or Arm performance exposure.
  • Familiarity with microservices debugging and distributed tracing (OpenTelemetry, Jaeger).
  • Experience building observability tooling, CI/CD perf gates, and regression detectors.
  • Large-scale fleet know-how: OS imaging/provisioning, config rollout, hardware health monitoring, and DC networking fundamentals.
  • Full-stack bonus: backend (e.g., MySQL) and light UI work (React/Tableau) for results dashboards.

Responsibilities

  • Design and lead the architecture and development of Uber's benchmarking and qualification platform, automating test orchestration, data collection, analysis, and reporting across multi-cloud and on-prem environments.
  • Develop and extend workloads and benchmarks (compute, storage, network, ML/AI) and integrate stress, chaos, and regression tests to validate hardware and platform choices.
  • Analyze and optimize end-to-end performance across hardware, firmware, Linux kernel, runtimes, and distributed services using advanced profiling tools (perf, eBPF, flamegraphs, tracing frameworks).
  • Build automation and observability tooling (Go/Python/Java, Kubernetes/Docker) for CI/CD-based performance regression detection, telemetry, alerting, and anomaly detection.
  • Collaborate with hyperscalers and silicon partners (Arm, GPU, and accelerator vendors) to evaluate emerging instances, kernels, and infrastructure technologies, and translate findings into roadmap recommendations.
  • Influence system architecture and tooling decisions that improve how Uber builds, monitors, and scales its infrastructure.
  • Drive execution and quality, writing design docs, setting milestones, mentoring ICs, and communicating insights and results to stakeholders and leadership.

What This Job Offers

Job Type

Full-time

Career Level

Senior

Industry

Transit and Ground Passenger Transportation

Number of Employees

5,001-10,000 employees
