Manager, Software Engineering - Observability

Figma | San Francisco, CA
6h | $250,000 - $350,000 | Remote

About The Position

Figma is growing our team of passionate creatives and builders on a mission to make design accessible to all. Figma’s platform helps teams bring ideas to life, whether you're brainstorming, creating a prototype, translating designs into code, or iterating with AI. From idea to product, Figma empowers teams to streamline workflows, move faster, and work together in real time from anywhere in the world. If you're excited to shape the future of design and collaboration, join us!

Figma’s Observability engineering team builds and operates the systems that give us deep visibility into the health, performance, and efficiency of our platform. From metrics, logs, and traces to cost attribution and budgeting, this team ensures that engineers across Figma can detect issues quickly, understand system behavior at scale, and make informed decisions about reliability and spend. The team owns and evolves our core observability stack (including platforms like Datadog, shared instrumentation libraries, and the agents and operators that power telemetry collection) while continuously raising the bar on signal quality and operational clarity.

As the Engineering Manager for Observability, you will lead a team of five engineers responsible for shaping the future of visibility and efficiency at Figma. You’ll define the strategy for instrumentation standards and cost transparency, drive initiatives to optimize observability footprint and spend, and explore innovative AI-driven approaches to anomaly detection and operational automation. This role is well-suited for a leader with strong distributed systems experience who is motivated by platform leverage, cross-functional impact, and building systems that enable every engineering team to operate with confidence and precision.

This is a full-time role that can be held from one of our US hubs or remotely in the United States.

Requirements

  • 4+ years of experience leading infrastructure, observability, or platform engineering teams, with a track record of delivering highly reliable production systems
  • Deep hands-on experience with modern observability platforms (e.g., Datadog, OpenTelemetry) across metrics, logs, and distributed tracing
  • Strong understanding of distributed systems, instrumentation best practices, SLO design, and incident response workflows
  • Experience driving cost transparency and accountability initiatives, including cost attribution, budgeting, forecasting, and alerting in cloud environments
  • Demonstrated ability to set technical direction, drive cross-functional alignment (Engineering, Finance, Security), and make sound architectural decisions in complex environments

Nice To Haves

  • Experience designing or evolving company-wide observability standards, shared libraries, and agent/operator-based integrations
  • Background in cost optimization for infrastructure or observability tooling, including vendor negotiations and usage modeling
  • Experience applying AI or machine learning techniques to anomaly detection, root cause analysis, or operational automation
  • Familiarity with OpenTelemetry and modern instrumentation frameworks across multiple programming languages
  • Experience scaling and mentoring high-performing engineering teams through platform expansion or significant architectural change

Responsibilities

  • Lead and grow a team of engineers responsible for the reliability, scalability, and evolution of Figma’s observability and cost engineering platforms
  • Own and operate Figma’s core observability stack, including vendor platforms such as Datadog, ensuring high availability, strong data quality, and effective signal-to-noise across metrics, logs, and traces
  • Define and drive the technical strategy for instrumentation standards, observability libraries, agents, and operators used to monitor internal- and external-facing services
  • Explore and implement innovative, AI-driven approaches to anomaly detection, root cause analysis, signal correlation, and operational automation
  • Establish clear frameworks for cost attribution, budgeting, forecasting, and alerting across infrastructure and observability spend, enabling teams to make informed tradeoffs
  • Partner with infrastructure, product engineering, finance, and security teams to improve visibility into system health and cost efficiency at scale
  • Lead initiatives to optimize observability footprint and spend, balancing depth of insight with performance and cost considerations
  • Coach and mentor engineers through career development, performance feedback, and technical leadership, fostering a culture of ownership, collaboration, and high-quality execution

Benefits

  • Figma offers equity to employees, as well as a competitive package of additional benefits, including health, dental & vision, retirement with company contribution, parental leave & reproductive or family planning support, mental health & wellness benefits, generous PTO, company recharge days, a learning & development stipend, a work from home stipend, and cell phone reimbursement.
  • Figma also offers sales incentive pay for most sales roles and an annual bonus plan for eligible non-sales roles.
  • Figma’s compensation and benefits are subject to change and may be modified in the future.