Disaster Recovery & COOP Engineer

Peraton
Herndon, VA

About The Position

We are seeking a highly skilled and innovative Disaster Recovery & COOP Engineer to join our team in the greater DMV area, supporting the Army National Guard.

Requirements

  • Experience: minimum of 8 years with a BS/BA, 6 years with an MS/MA, or 3 years with a PhD.
  • Clearance: active TS/SCI.
  • Candidate must meet ONE of the following:
      • Master’s degree or Ph.D. in Computer Science, Software Engineering, Cybersecurity, Data Science, or a related field; OR
      • Relevant DoD/military training (examples: SANS SEC545; SANS LDR512; Joint Cyber Analysis Course (JCAC)); OR
      • Relevant professional certification or equivalent experience (examples: ISC2 CISSP; CompTIA CASP+; AWS Certified DevOps Engineer – Professional; CISM; GIAC GWEB).
  • Required experience and skills:
      • DR/COOP, systems engineering, or resiliency architecture experience with at least 3 years leading enterprise DR/COOP design and exercises.
      • Proven ability to design and validate failover architectures, backup/recovery strategies (including RMAN/DB backups, storage replication, and cloud DR), and end‑to‑end recovery playbooks.
      • Hands‑on experience conducting COOP/DR exercises, failover testing, recovery validation, and scripting/automation for recovery procedures.
      • Strong knowledge of RMF/ATO impacts on continuity, STIG/CSG considerations for recoverable systems, and evidence requirements for accreditation and audits.
      • Experience coordinating cross‑functional recovery activities with SOC/CIRT, NOC, system owners, and external providers during incidents.
      • Ability to produce decision‑grade recovery plans, technical briefs, POA&Ms, and exercise after‑action reports.

Nice To Haves

  • Prior DoD/ARNG COOP/DR or enterprise resilience experience.
  • Experience with cloud failover patterns, multi‑region replication, orchestration of recovery workflows, and automated recovery validation.
  • Familiarity with business continuity planning, service continuity metrics, and integration of DR/COOP with incident response and operational exercises.

Responsibilities

  • Architect, develop, and maintain enterprise disaster recovery (DR) and Continuity of Operations (COOP) capabilities for systems, applications, and cybersecurity platforms.
  • Design COOP plans, backup/recovery procedures, contingency playbooks, failover architectures, and recovery point/time objectives aligned with RMF, DoD, Army, and ARNG directives.
  • Maintain COOP environments, coordinate failover testing and contingency exercises, and validate recovery procedures to ensure continuity of essential services during disruptions.
  • Integrate redundancy, continuity, and recovery requirements into system and network designs to ensure resilience against cyber incidents, infrastructure failures, and mission disruptions.
  • Collaborate with cybersecurity engineering, SOC/CIRT, enterprise architects, and system owners to assess resilience, identify continuity gaps, and recommend architectural improvements.
  • Lead development and execution of DR/COOP exercises, capture lessons learned, and update playbooks, runbooks, and transition‑to‑operations artifacts.
  • Support outage monitoring, incident coordination, and operational recovery activities; provide technical leadership during degraded operations and failover events.
  • Produce recovery strategies, technical recovery designs, system analysis documentation, and decision‑grade briefings for program leadership.
  • Drive continuous improvement of DR/COOP capabilities through automation, validation testing, and modernization initiatives.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Number of Employees: 5,001-10,000
