Senior Site Reliability Engineer

PayPalSan Jose, CA
Hybrid

About The Position

At PayPal, Senior Site Reliability Engineers (SREs) drive the reliability, performance, and availability of our global mobile and backend systems. As part of our new Mobile SRE team, you’ll bridge the gap between iOS and Android clients and the backend services that power them, delivering seamless experiences for millions of customers. In this hands-on role, you’ll implement reliability standards, build automation, and enhance observability across the stack. By developing actionable insights, automating key workflows, and advancing operational excellence, you’ll help ensure PayPal’s platforms deliver reliable, high-performance experiences customers can trust worldwide.

Requirements

  • 3+ years relevant experience and a Bachelor’s degree OR Any equivalent combination of education and experience.
  • Proven experience in Site Reliability Engineering, software development, or systems engineering, with a focus on end-to-end system reliability and performance.
  • Strong understanding of backend architectures, including APIs, data flows, and cross-system dependencies.
  • Hands-on experience developing monitoring, observability, and alerting solutions using tools such as Datadog, Firebase Crashlytics, or Sentry.
  • Skilled in automation and tooling development using Python, Go, or similar languages to reduce manual processes and improve efficiency.
  • Experience implementing SLIs/SLOs and leveraging metrics to drive measurable improvements in reliability and availability.
  • Solid foundation in distributed systems, cloud infrastructure (AWS, GCP, or Azure), and CI/CD pipelines for reliable software delivery.
  • Strong debugging and problem-solving skills, capable of diagnosing and resolving complex issues across mobile, API, and backend systems.
  • Effective collaborator and communicator, skilled at partnering across mobile, backend, and SRE teams to deliver cohesive reliability outcomes.
  • Demonstrated ability to mentor engineers and foster a culture of observability, automation, and operational excellence.

Nice To Haves

  • Understanding of mobile (iOS and Android) applications
  • Experience improving incident response workflows, postmortem and on-call models processes.
  • Background in performance optimization, fault tolerance, and disaster recovery for large-scale systems.
  • Experience collaborating within distributed or global engineering teams.

Responsibilities

  • Take ownership of system performance monitoring, identify inefficiencies, and lead initiatives to improve the overall availability and reliability of digital platforms and applications.
  • Lead and manage the response to complex, high-priority incidents, ensuring prompt resolution and a thorough root cause analysis to prevent future occurrences.
  • Design and implement advanced automation frameworks to improve operational efficiency, streamline processes, and reduce human error.
  • Lead reliability-focused initiatives, ensuring systems are highly available, resilient, and scalable, and promote best practices across engineering teams.
  • Enhance the monitoring infrastructure by identifying key metrics, optimizing alerting, and improving system observability to ensure the reliability of large-scale systems.
  • Forecast resource requirements and lead capacity planning activities to ensure systems can scale effectively to meet growing user demand.
  • Ensure robust disaster recovery strategies are in place and conduct regular testing to ensure systems can recover quickly from failures.
  • Partner with engineering and product teams to identify opportunities for improving system architecture, focusing on scalability, reliability, and fault tolerance.
  • Provide mentorship and technical guidance to junior site reliability engineers, fostering skill development and knowledge sharing.
  • Drive continuous improvement across operational workflows, identifying areas for optimization, cost reduction, and performance enhancement.

Benefits

  • At PayPal, we’re committed to building an equitable and inclusive global economy.
  • That’s why we offer comprehensive, choice-based programs, to support all aspects of personal wellbeing—physical, emotional, and financial—delivering meaningful value where it matters most.
  • We strive to create a flexible, balanced work culture with a holistic approach to benefits, including generous paid time off, healthcare coverage for you and your family, and resources to create financial security and support your mental health.
© 2024 Teal Labs, Inc
Privacy PolicyTerms of Service