About The Position

Ensures the resilience, recoverability, and continuity of critical IT services by driving organizational resilience strategies, optimizing recovery architectures, and maintaining full compliance with audit and regulatory standards to safeguard uninterrupted business operations.

Requirements

  • Bachelor’s degree in Information Technology, Computer Science, Engineering, or a related technical field.
  • 5 years of hands‑on experience in IT infrastructure, with direct responsibility over backup platforms, disaster recovery operations, or data protection technologies.
  • Proven experience managing enterprise‑level backup solutions, storage systems, and replication frameworks across hybrid or multi‑site environments.
  • Demonstrated experience developing, maintaining, and executing Disaster Recovery Plans (DRP), Business Continuity procedures, and failover/runbook testing.
  • Strong background conducting backup validation, restore testing, and DR resilience assessments in complex enterprise infrastructures.
  • Practical experience automating operational tasks using Shell, Bash, or Python scripting.
  • Experience collaborating with cross‑functional teams including Infrastructure, Cloud, Network, Database, and Cybersecurity to support end‑to‑end recoverability.
  • Prior involvement in incident and problem management processes, providing root cause analysis and recovery troubleshooting for backup or DR failures.
  • Experience serving as a subject-matter expert and resource supporting internal and external clients.
  • Strong Unix/Linux administration skills, with the ability to manage, troubleshoot, and optimize systems used in backup and disaster recovery operations.
  • Deep understanding of backup technologies, replication methods, data protection best practices, retention policies, and enterprise‑scale DR architectures.
  • Hands‑on expertise with SAN/NAS storage systems, replication frameworks, and backup appliances used in hybrid or multi‑site infrastructures.
  • Proficiency in scripting (Shell, Bash, Python) to automate DR workflows, backup processes, testing routines, and operational tasks.
  • Solid knowledge of virtualization platforms such as VMware, Hyper‑V, Nutanix, and their integration with DR and recovery strategies.
  • Strong analytical and troubleshooting abilities for diagnosing complex infrastructure, performance, or recoverability issues.
  • Ability to work effectively under pressure during outages, DR simulations, or real recovery events, maintaining precision and situational awareness.
  • Excellent communication skills for coordinating with cross‑functional teams and documenting technical processes clearly and accurately.
  • Strong understanding of ITSM practices, including incident, problem, and change management, especially as they relate to recovery operations.
  • High level of organization and attention to detail for maintaining DR documentation, runbooks, inventories, CMDB records, and compliance evidence.

Nice To Haves

  • Preferred professional certifications:
      – Veeam Certified Engineer (VMCE)
      – Microsoft Azure Administrator or Azure Backup & Recovery (AZ‑104 or AZ‑305)
      – Micro Focus Data Protector certifications (preferred for legacy or transitional environments)
      – Linux certifications such as RHCSA, LFCS, or CompTIA Linux+
      – ITIL Foundation certification

Responsibilities

  • Provides expertise in enterprise backup platforms, disaster recovery infrastructure, and data replication technologies to guarantee that systems and information can be restored effectively during disruptions or large‑scale outages.
  • Develops and maintains Disaster Recovery Plans (DRP), executes backup validation and failover exercises, and collaborates closely with Infrastructure, Network, Cloud, Database, and Cybersecurity teams to ensure end‑to‑end recoverability.
  • Supports and administers enterprise backup platforms, disaster recovery systems, and data replication technologies to ensure full data protection and recoverability across all IT environments.
  • Monitors disaster recovery (DR) infrastructure, including storage arrays, backup appliances, replication endpoints, and designated recovery sites, ensuring alignment with Business Continuity objectives and RTO/RPO requirements.
  • Coordinates, documents, and performs backup validation, restore tests, disaster recovery failover exercises, and routine resilience assessments.
  • Monitors and maintains accurate documentation of DR procedures, backup policies, recovery workflows, system inventories, and architectural diagrams, ensuring alignment with compliance and audit requirements.
  • Delivers training and guidance to stakeholders to enhance understanding and adoption of best practices.
  • Follows established company policies, procedures, and standards to ensure full compliance with applicable laws and industry regulations.
  • Participates in storm restoration tasks and assigned drills to contribute to the safe and reliable recovery of services.
  • Performs additional tasks aligned with role expectations and qualifications to support team goals and operational flexibility.