About The Position

We’re looking for a Software Developer Specialist to join our Disaster Recovery Governance Team. In this role, you’ll help shape how we define, implement, and continuously improve the resilience of our systems and platforms across the organization. You’ll fit right in if you enjoy solving complex technical challenges, working across teams and systems, and building solutions that hold up when it matters most!

Requirements

  • Bachelor’s degree in Computer Science, Computer Engineering, or a related field, or equivalent experience.
  • 4+ years of experience designing, building, and supporting distributed systems, with a strong understanding of failure modes, system dependencies, and resiliency patterns (e.g., retries, graceful degradation, recovery-ready design).
  • Solid grounding in disaster recovery principles, including failover strategies, recovery planning, and measurable recovery objectives (RTO/RPO), with experience supporting or participating in regular DR testing.
  • Hands-on experience in AWS cloud environments, with familiarity across infrastructure, application layers, and data processing systems.
  • Experience building or contributing to automation and tooling that supports reliability, testing, or operational workflows (e.g., readiness checks, repeatable recovery tests, failover/rollback execution).
  • Strong collaboration and communication skills, with the ability to translate complex technical concepts into clear documentation, runbooks, and shared standards across teams.
  • Familiarity with monitoring, alerting, and operational readiness practices, including backup and data protection concepts, restore validation, and infrastructure-as-code for repeatable environments.

Nice To Haves

  • Experience supporting disaster recovery tests and resilience “game day” exercises (including controlled fault-injection experiments).
  • Familiarity with high-availability architectures across multiple environments or regions, ideally in regulated, reliability-focused, or globally distributed environments.
  • Experience developing in Java or Python.

Responsibilities

  • Design and implement disaster recovery solutions across services and platforms, ensuring systems meet defined recovery and resiliency standards.
  • Develop and evolve disaster recovery governance frameworks that establish the standards, processes, and capabilities needed to build a scalable enterprise disaster recovery platform.
  • Collaborate with engineering, infrastructure, and operations teams to understand system dependencies and develop effective failover strategies across environments.
  • Build and maintain automation and tooling to support recovery testing, failover execution, and environment readiness.
  • Participate in disaster recovery tests and simulations, contributing to execution, validation, and follow-through on identified improvements.
  • Document recovery procedures, system dependencies, and test outcomes to support operational readiness and audit requirements.

Benefits

  • Hybrid work environment
  • Total rewards program that covers every dimension of life—from building wealth and growing your career to prioritizing well-being and family care.