Site Reliability Engineer / DevSecOps

CACI International | Aberdeen Proving Ground, MD
1d

About The Position

Exciting opportunity to serve as a lead engineer designing, building, and maintaining secure, scalable, and resilient CI/CD pipelines that fully automate software delivery for a US Army client.

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering or similar.
  • 7 years of professional experience, including at least 5 years in software development, cloud engineering, or systems administration.
  • Active DoW Secret clearance and ability to retain that clearance.
  • Extensive experience designing, building, and maintaining automated DevSecOps pipelines using tools like Azure DevOps, Jenkins, or GitLab CI.
  • Strong experience with automation and Infrastructure as Code (IaC) tools such as Terraform, Ansible, Python, and PowerShell.
  • Understanding of system observability, including experience with the collection and analysis of telemetry and logs to ensure system reliability.
  • Proficiency with containerization and orchestration technologies, particularly Kubernetes.
  • Experience in implementing and managing monitoring and alerting solutions, such as Prometheus, Grafana, and/or Azure Monitor.
  • Strong understanding of security principles and experience integrating security practices into the CI/CD pipeline.

Nice To Haves

  • Master's degree in Computer Science, Computer Engineering, or similar, with at least 5 years of experience in software development, cloud engineering, or Site Reliability Engineering.
  • Certifications in Site Reliability Engineering, Kubernetes Administration, and/or Azure DevOps Engineering are highly desired.

Responsibilities

  • Design, build, and maintain secure, scalable, and resilient CI/CD pipelines to fully automate the delivery of software.
  • Collaborate with development and cybersecurity teams to embed security best practices and automated testing throughout the software development lifecycle.
  • Implement and manage robust monitoring, logging, and telemetry solutions to proactively identify and resolve issues.
  • Automate infrastructure provisioning, configuration management, and application deployments to ensure consistency and reliability.
  • Provide subject matter expertise on reliability and scalability, working with teams to architect resilient and performant systems.
  • Troubleshoot complex production incidents, conduct root cause analysis, and implement preventative measures.
  • Optimize system performance, reliability, and cost-efficiency through continuous improvement and automation.
  • Provide guidance and mentorship to junior team members, fostering a culture of automation, reliability, and operational excellence.

Benefits

  • healthcare
  • wellness
  • financial
  • retirement
  • family support
  • continuing education
  • time off benefits