Principal Software Engineer

Docusign
Remote

About The Position

Docusign is at a pivotal moment. We are transforming from the world's leading eSignature tool into an Intelligent Agreement Management (IAM) platform. This shift requires more than just "stable" infrastructure; it requires a revolutionary approach to reliability. We are looking for a Principal SRE who doesn't just manage systems, but transforms them. You are an Engineering-First visionary who believes that the only way to achieve 99.99% reliability at scale is through code, not humans. You will be tasked with evolving our SRE organization from a traditional support-focused model into a lighthouse of engineering excellence that leads the company's technical strategy. This position is an individual contributor role reporting to the VP, Engineering - Site Reliability.

Requirements

  • 15+ years of experience in large-scale distributed systems, software engineering, or infrastructure roles, with a track record of driving system architecture
  • Experience as a software engineer by trade with deep proficiency in Go or Python, possessing a "code-first" approach and a passion for writing production-grade automation services alongside the engineering team
  • Proven technical leadership in building global, active-active distributed systems at hyperscale, functioning simultaneously as an architect and an engineering peer
  • Production-hardened mastery of Kubernetes and Terraform for managing complex, multi-tenant cloud topologies
  • Experience acting as a primary Lead Incident Commander for tier-0 global outages, with the ability to translate operational chaos into actionable technical stabilization
  • Experience defining "Developer Experience" strategies and contributing to Internal Developer Platforms (IDPs) that bake resilience and infrastructure abstractions directly into developer workflows

Nice To Haves

  • Technical expertise executing high-stakes on-premises-to-cloud migrations to Microsoft Azure (specifically using Azure Kubernetes Service / AKS and Azure traffic routing)
  • Hands-on experience architecting global distributed tracing capabilities using the OpenTelemetry ecosystem to track deep, user-centric SLO metrics across microservices
  • Experience developing self-healing infrastructure patterns through a blend of deterministic code and AI-assisted/predictive anomaly remediation models
  • Experience championing and setting up automated fault-injection frameworks to proactively prove system recoverability before a real production incident occurs
  • Experience building safe deployment architectures (Canary, Blue/Green) managed via secure pipelines (GitHub Actions, Azure DevOps) with automated safety policies embedded directly into the code lifecycle

Responsibilities

  • Lead and code with the team
  • Lead the cultural and technical shift toward treating reliability as a product feature
  • Move the org away from reactive "ops" work toward building durable platforms and self-healing systems
  • Bring elite Incident Commander skills: although you are not expected to be in the daily on-call rotation, step in during high-stakes outages to bring calm and clarity, and use those experiences to architect systems that keep those incidents from happening again
  • Define the "Golden Paths" for our Cloud migration, ensuring that as Docusign scales globally, our architecture remains "Multi-Active" and impervious to regional cloud failures
  • Challenge the status quo, mentoring Senior and Staff SREs to think like software architects
  • Advocate for "Error Budgets" that have real teeth, influencing product roadmaps to prioritize long-term stability

Benefits

  • Bonus: Sales personnel are eligible for variable incentive pay dependent on their achievement of pre-established sales goals. Non-Sales roles are eligible for a company bonus plan, which is calculated as a percentage of eligible wages and dependent on company performance.
  • Stock: This role is eligible to receive Restricted Stock Units (RSUs).
  • Paid Time Off: earned time off, as well as paid company holidays based on region
  • Paid Parental Leave: take up to six months off with your child after birth, adoption or foster care placement
  • Full Health Benefits Plans: options for 100% employer paid and minimum employee contribution health plans from day one of employment
  • Retirement Plans: select retirement and pension programs with potential for employer contributions
  • Learning and Development: options for coaching, online courses and education reimbursements
  • Compassionate Care Leave: paid time off following the loss of a loved one and other life-changing events