About The Position

Our Enterprise Technology team is at the forefront of driving reliability, performance, and exceptional user experiences across Staples’ business-critical platforms. We partner closely with Engineering, SRE, Product, and Infrastructure teams to deliver integrated observability solutions and ensure the highest standards of operational excellence. As we continue to scale our enterprise systems, we seek innovative leaders ready to champion new technologies and standardize best-in-class practices across our organization.

The Enterprise Tools Engineering Architect will lead the architecture, design, and evolution of Staples’ observability platforms, with a primary focus on Application Performance Monitoring (APM), infrastructure monitoring, session replay, and log analytics. This role will define and implement scalable, integrated observability solutions that provide end-to-end visibility across application, infrastructure, and customer experience layers, driving platform standardization and alignment with our business priorities.

Requirements

  • Deep expertise designing and administering APM platforms (such as New Relic, Dynatrace, or Datadog), with strong experience in distributed tracing, transaction monitoring, and service dependency mapping.
  • Strong background with infrastructure monitoring platforms (e.g., Zabbix, Prometheus, Azure Monitor), and deep understanding of system-level telemetry across compute, network, storage, and hybrid cloud environments.
  • Experience with session replay and user experience platforms (e.g., FullStory) and the ability to correlate user behavior with backend performance and system health.
  • Proven experience with log analytics platforms (e.g., Splunk, Elastic), including data ingestion, routing, indexing strategies, and retention management.
  • Ability to design cross-platform observability architecture, integrating APM, infrastructure, logs, and session data with cloud platforms, CI/CD pipelines, and enterprise systems.
  • Proficiency in scripting and automation (Python, Bash) for platform configuration, scaling, and governance framework definition.
  • Strong leadership and cross-functional collaboration skills, with demonstrated ability to drive strategy, roadmaps, and platform standardization.
  • Excellent vendor management skills and the ability to influence technical direction and resolve challenges efficiently.
  • Bachelor’s degree in Computer Science, Engineering, or a directly related field of study.
  • 10+ years of progressive experience in observability, APM, and infrastructure monitoring, with demonstrated architecture-level ownership.

Nice To Haves

  • Certifications in one or more observability platforms (e.g., New Relic, Dynatrace, Datadog, Splunk Architect/Consultant, FullStory).
  • Experience with AI/ML-driven observability, anomaly detection, or AIOps platforms.
  • Familiarity with generative AI or agentic frameworks applied to observability workflows.
  • Experience in multi-vendor observability environments and tool rationalization efforts.
  • Strong understanding of SRE practices including SLOs, error budgets, and reliability engineering.
  • Experience supporting large-scale, high-traffic B2C and B2B platforms.

Responsibilities

  • Lead the architecture, design, and continuous improvement of enterprise observability platforms, focusing on APM, infrastructure monitoring, session replay, and log analytics.
  • Define and implement scalable and integrated observability solutions that deliver end-to-end visibility across applications, infrastructure, and user experience layers to enhance system reliability, performance, and service quality.
  • Establish architecture patterns, governance models, and integration frameworks to enable consistent telemetry, correlation, and actionable insights across all Staples systems.
  • Partner with Engineering, SRE, Product, and Infrastructure teams to embed observability best practices into the software development lifecycle, improving mean time to detect (MTTD), mean time to resolve (MTTR), and overall platform resilience.
  • Lead the evaluation and adoption of new technologies—including distributed tracing, OpenTelemetry, and AI-driven observability—ensuring alignment with enterprise strategy and cost optimization goals.
  • Drive platform standardization, signal quality, and integration with business-critical services to support operational excellence and user satisfaction.
  • Influence vendor roadmaps, resolve vendor issues efficiently, and provide strategic leadership across multiple teams and domains.

Benefits

  • Inclusive culture with associate-led Business Resource Groups
  • 22 days of PTO and Holiday Schedule (7 observed paid holidays + 1 floating holiday)
  • Online and Retail Discounts
  • Company Match 401(k)
  • Physical and Mental Health Wellness programs