DevOps Lead

Leidos
Remote

About The Position

Leidos is seeking a DevOps Lead to direct and further mature the platform engineering, site reliability engineering (SRE), and application security functions within a large-scale software development and IT operations program. This role is responsible for driving operational excellence, security, resiliency, and scalability across enterprise DevOps platforms supporting mission-critical applications for federal government customers.

Position Summary: The DevOps Lead will oversee a multidisciplinary team tasked with enabling continuous integration/continuous deployment (CI/CD), optimizing cloud and on-premises platforms, securing development pipelines, and ensuring highly reliable and performant systems. This leader will collaborate with architects, engineering leads, project managers, and customer stakeholders to maintain an automation-first culture and introduce best practices for reliability, security, and DevOps maturity. This opportunity is ideal for an accomplished DevOps leader who thrives on solving complex platform, security, and reliability challenges at scale and is passionate about technical excellence and customer mission achievement.

Location: This position may be remote, though periodic visits to customer sites or the Program Office (within the Washington, DC region) are required.

Requirements

  • Bachelor’s degree and 10+ years of progressive experience in software development, DevOps, or platform engineering.
  • 3+ years of technical team leadership or management experience.
  • Demonstrated expertise in advanced DevOps practices, including CI/CD, configuration management, automation, and cloud-native operations (AWS, Azure, or similar).
  • Hands-on experience with SRE frameworks, monitoring, logging, alerting, and reliability engineering techniques.
  • Proven background in securing applications and systems, including integrating security into pipelines and coordinating with security/compliance teams.
  • Strong technical knowledge of containers and container orchestration (Docker, Kubernetes), IaC (Terraform, CloudFormation), and end-to-end application/platform lifecycle management.
  • Excellent interpersonal, written, and verbal communication skills.
  • Strong problem-solving skills and ability to thrive in a fast-paced, dynamic environment.
  • U.S. citizenship and ability to obtain and maintain required government security clearance.

Nice To Haves

  • Certifications in cloud platforms (e.g., AWS Certified DevOps Engineer, Azure DevOps), SRE, or security (e.g., CISSP, CISM).
  • Experience managing federal or large enterprise DevOps/SRE/Platform Engineering teams.
  • Familiarity with federal compliance frameworks (FedRAMP, FISMA, NIST 800-53).
  • ITIL Foundations or equivalent IT service management training.
  • Prior experience with infrastructure modernization, large-scale migrations, or complex system integrations.
  • Familiarity with Department of Justice or federal government IT environments.

Responsibilities

  • DevOps & Platform Management: Direct the design, implementation, and maintenance of CI/CD pipelines, automated provisioning, and monitoring processes across cloud and hybrid environments.
  • Lead efforts to standardize and optimize platform engineering practices, adopting Infrastructure-as-Code (IaC) and microservices deployment models.
  • Site Reliability Engineering (SRE): Develop and enforce SRE principles, including release management, system reliability, observability, incident management, SLAs/SLOs, and fault tolerance.
  • Implement monitoring and alerting solutions to proactively identify issues, reduce mean time to resolution (MTTR), and drive service uptime objectives.
  • Application Security: Integrate security throughout the SDLC, ensuring robust code review, vulnerability scanning, and threat modeling within automated pipelines.
  • Collaborate with security teams to remediate vulnerabilities and achieve compliance with industry and federal standards (e.g., FedRAMP, NIST).
  • Team Leadership & Collaboration: Mentor and lead a multidisciplinary team, promoting an agile, collaborative, and innovative work environment.
  • Partner with development, security, and operations teams to align platform strategies with business and mission requirements.
  • Platform Engineering: Champion creation and curation of reusable infrastructure patterns, automation scripts, cloud orchestration templates, and developer self-service platforms.
  • Evaluate emerging tools and technologies for enhancing developer productivity and system resilience.
  • Service Operations & Incident Response: Oversee operational readiness, incident response, root cause analysis, and continuous improvement initiatives to ensure high availability and rapid recovery from service disruptions.
  • Continuous Improvement: Drive a culture of innovation by assessing and implementing advancements in DevOps, platform engineering, and SRE practices.
  • Regularly review system metrics and operational KPIs, and propose enhancements.
  • Reporting & Stakeholder Communication: Prepare and present operational dashboards, incident reports, risk assessments, and status updates to program leadership and customers.
  • Ensure transparent communication of operational posture and improvement initiatives.

Benefits

  • Employment benefits include competitive compensation, health and wellness programs, income protection, paid leave, and retirement.