About The Position

Who are we? Equinix is the world’s digital infrastructure company®, shortening the path to connectivity to enable the innovations that enrich our work, life and planet. A place where tech thinkers and future builders turn bold ideas into breakthrough experiences, we welcome your unique perspective. Help us challenge assumptions, uncover bias, and remove barriers—because progress starts with fresh ideas. You’ll find belonging, purpose, and a team that welcomes you—because when you feel valued, you’re empowered to do your best work.

Job Summary

The Platform Tools & Delivery (PTD) organization is the unified platform engineering team within Core Product Services (CPS). We are responsible for the secure, scalable, and consistent delivery of Equinix's digital products. This role leads the technical vision and consolidation of observability signals and reliability standards across Equinix's global hybrid footprint for the engineering teams that build and run Equinix’s infrastructure, products, and services.

Requirements

  • 10+ years in Platform Engineering, Site Reliability Engineering (SRE), or Observability-focused roles
  • Bachelor’s in Computer Science, Computer Engineering, or a related technical field
  • Expert-level knowledge of Platform Engineering, Grafana Cloud, Observability concepts (Logs, Metrics, Traces, RUM, Synthetics, etc.), and Operational Readiness
  • Competence with Kubernetes, ArgoCD, on-premises and cloud (AWS) infrastructure, and software engineering practices including CI/CD

Nice To Haves

  • Familiarity with Go development, cluster-api and the CNCF ecosystem is preferred

Responsibilities

  • Interacts with internal product management and engineering teams to understand product requirements and define the platform roadmap
  • Works with the Equinix Engineering Excellence (E3) team in the Equinix IT organization to find common points of acceleration and bidirectional consumption of services
  • Acts as a lead representative for Infrastructure P&S requirements in forums for enterprise-wide developer initiatives, plans, and architectures
  • Defines the platform reliability standards through the development of a comprehensive SLO/SLI framework
  • Drives architectural consistency for observability across a hybrid footprint including 31 metros and multiple AWS regions
  • Consolidates all application observability signals onto a single platform (Grafana Cloud) to provide a single source of truth
  • Provides technical leadership for the design of the "Paved Path" regarding application assurance and reliability signals
  • Evaluates and recommends the consolidation of disparate, non-unified observability tools and parallel support systems in favor of unified, strategic solutions
  • Designs integration strategies for identity and access management to ensure secure developer access to platform tools
  • Participates in the development of automated reliability signals and self-service observability tools
  • Drives project work and creates automation for the observability stack and application lifecycle tools
  • Participates in peer reviews and technical integration efforts to ensure cross-functional alignment within the PTD and CPS organizations
  • Sets standards for application assurance, including vulnerability management and identity integration programs
  • Recommends frameworks for measuring platform performance, such as Kubernetes API server uptime and provisioning delivery time
  • Articulates the vision for a unified runtime that leverages both global on-premises footprints and cloud capabilities
  • Leads the Observability Stack Unification charter as part of the broader CI/CD and platform consolidation effort
  • Utilizes FinOps and financial observability reporting to provide cost attribution by product, team, and organization
  • Defines and publishes critical reliability metrics, including Mean Time to Detect (MTTD) and Mean Time to Repair (MTTR)
  • Provides L4 technical escalation capacity to stabilize critical, high-toil services
  • Participates in on-call rotations for respective observability and operations areas to ensure 24/7 platform stability
  • Serves as a technical liaison for internal product teams (the platform's customers) to understand concerns and priorities
  • Acts as a primary point of contact for technical perspectives and alignment with stakeholders in the Equinix product organization and the Equinix IT organization
  • Works with Engineering Managers to define platform KPIs and project schedules for unification efforts
  • Provides status reporting on the Observability Standard and other strategic consolidation projects
  • Investigates and evaluates new observability technologies to reduce infrastructure toil for product teams
  • Influences the organization’s technical objectives by identifying fruitful opportunities in areas like telemetry and proactive alerting

Benefits

  • As an employee, you become important to Equinix’s success. We ensure all your benefits are in line with our core values: competitive, inclusive, sustainable, connected and efficient. We keep them competitive within the current marketplace to ensure we’re providing you with the best package possible. So, wherever you are in your career and life, you’ll be able to enhance your experience and bring your whole self to work.
  • Employee Assistance Program: An Employee Assistance Program is available to all employees.
  • US Benefits:
    - Insurance: You may enroll in health, life, disability and voluntary plans that are designed for you and your eligible family members.
    - Retirement: You and Equinix may contribute to a retirement plan to help you plan for your financial future.
    - Paid Time Off (PTO) and Paid Holidays: You will receive an accrued amount of PTO each pay period along with various paid holidays for you to rest and recharge.
    Eligibility requirements apply to some benefits. Benefits are subject to change and may be subject to specific plan or program terms.