Director, Software Engineering

Docusign
Seattle, WA
Hybrid

About The Position

Docusign operates global, always-on services. To safeguard customer trust and empower our engineering teams, we are building the next generation of monitoring and troubleshooting capabilities. You will lead the engineering organization behind our AI-powered observability platform. Built to complement existing tools like Grafana, Prometheus, and ClickHouse/Azure Data Explorer, this platform addresses the challenges of information overload by making observability accessible to everyone. You will manage a multidisciplinary team of backend engineers, machine learning engineers, and data scientists to build predictive, automated insights while running a highly critical service with strict SLAs. This position is a people manager role reporting to the Senior Director, Software Engineering.

Requirements

  • 12+ years in software/infra/platform engineering, including 6+ years operating high-scale, 24x7 back-end systems.
  • 3+ years managing managers and senior ICs, specifically leading multidisciplinary teams across domains like backend engineering and machine learning.
  • Proven ownership of mission-critical services with strict SLOs, incident management, and change control.
  • Experience hiring, developing, and retaining senior/principal talent and building inclusive, high-trust teams.

Nice To Haves

  • Familiarity with modern observability practices (metrics, logs, distributed tracing) and standards such as OpenTelemetry
  • Experience leading teams that build, train, and deploy machine learning models in production environments for AIOps, anomaly detection, or predictive analytics
  • Experience improving the performance and reliability of large telemetry/data pipelines at a multi-region scale
  • Exposure to multi-cloud environments and Kubernetes-based platforms

Responsibilities

  • Manage high-throughput, real-time observability pipelines processing massive volumes of telemetry data across multi-region, multi-cloud environments
  • Operate a Tier-0 data plane with strict SLOs, disciplined change management, and high availability requirements
  • Serve as the observability backbone for every engineering team's reliability and velocity, dramatically reducing the time from "something's wrong" to "here's the problem"
  • Set a clear 12-24 month vision for AIOps capabilities, focusing on automated troubleshooting workflows, proactive anomaly detection, and smart alert aggregation
  • Bridge the gap between robust backend infrastructure and applied machine learning, ensuring models for auto-threshold estimation and pattern recognition are effectively trained and reliably deployed at scale
  • Drive availability, durability, incident readiness, and disaster recovery for the observability plane; run regular resilience drills
  • Lead, hire, and grow a senior-heavy team of backend engineers, ML engineers, and applied scientists. Build an architecture culture, clear career paths, and a high-judgment, high-ownership operating model
  • Own the annual operating plan for the AIOps platform, covering capacity, availability, and budget, while meeting or beating committed SLOs/SLAs
  • Drive the development of intelligent capabilities like automated impact analysis, ML-driven threshold estimation, and natural language interfaces to reduce alert noise and accelerate debugging
  • Continually improve ingestion latency, query performance, storage efficiency, and cost per unit while maintaining reliability through traffic spikes and deploys
  • Partner with SRE, Telemetry Platform, Security, Finance, and Product; make pragmatic build-vs-buy decisions; manage vendors and capacity commitments
  • Lead on-call and incident command for the observability platform

Benefits

  • Bonus: Sales personnel are eligible for variable incentive pay dependent on their achievement of pre-established sales goals. Non-Sales roles are eligible for a company bonus plan, which is calculated as a percentage of eligible wages and dependent on company performance.
  • Stock: This role is eligible to receive Restricted Stock Units (RSUs).
  • Paid Time Off: earned time off, as well as paid company holidays based on region
  • Paid Parental Leave: take up to six months off with your child after birth, adoption or foster care placement
  • Full Health Benefits Plans: options for 100% employer paid and minimum employee contribution health plans from day one of employment
  • Retirement Plans: select retirement and pension programs with potential for employer contributions
  • Learning and Development: options for coaching, online courses and education reimbursements
  • Compassionate Care Leave: paid time off following the loss of a loved one and other life-changing events