Manager of DevOps Engineering

Arch Capital Group
Raleigh, NC
Hybrid

About The Position

With a company culture rooted in collaboration, expertise and innovation, we aim to promote progress and inspire our clients, employees, investors and communities to achieve their greatest potential. Our work is the catalyst that helps others achieve their goals. In short, We Enable Possibility℠.

The Manager of DevOps Engineering is responsible for leading the design, implementation, and operational excellence of enterprise-scale CI/CD systems, infrastructure automation, and engineering. This role requires deep expertise in modern DevOps tooling, distributed systems, and cloud-native architectures, and provides technical leadership to a team of DevOps Engineers. This individual will drive the evolution of our DevOps practices, ensuring automation-first delivery pipelines, hardened infrastructure, and highly available services that underpin mission-critical business applications.

Requirements

  • Proven expertise with cloud platforms (Azure and AWS required).
  • Strong hands-on experience with Kubernetes, service mesh, and containerized deployments at scale.
  • Deep knowledge of Infrastructure-as-Code (Terraform, Terragrunt) and configuration management.
  • Proficiency in scripting/programming (Python, Go, Bash, PowerShell) for automation and tooling.
  • Advanced understanding of CI/CD best practices, including GitOps workflows and progressive delivery (canary, blue/green).
  • Familiarity with networking (VPC design, ingress/egress, load balancing, CNI plugins).
  • Strong foundation in distributed systems, scaling, fault tolerance, and reliability engineering.
  • Experience with observability stacks (Prometheus, Grafana, ELK, Dynatrace) and incident management.
  • Understanding of modern security controls, secrets management, and compliance frameworks.

Responsibilities

  • Own and scale enterprise-wide CI/CD pipelines using modern orchestration tools (e.g., GitHub Actions, CI, ArgoCD).
  • Architect developer self-service platforms with Infrastructure-as-Code (IaC).
  • Implement role-based access controls (RBAC) across Kubernetes, cloud IAM, and toolchains to ensure compliance and security.
  • Build extensible automation frameworks enabling teams to provision, deploy, and monitor workloads with minimal friction.
  • Evaluate and integrate next-generation CI/CD features such as ephemeral environments, policy-as-code enforcement, and test environment provisioning on demand.
  • Establish and govern standardization of base container images, Helm charts, and deployment templates to promote consistency and reduce security drift across development teams.
  • Manage cloud-native infrastructure (Azure and AWS) with a focus on resiliency, scalability, and cost optimization as it pertains to product workloads.
  • Lead adoption of Kubernetes and container orchestration platforms with advanced configuration (e.g., service mesh, Cilium, Calico, OPA/Gatekeeper).
  • Standardize configuration management using Terraform, Terragrunt, or ArgoCD with Helm, and integrate with CI/CD pipelines for immutable deployments.
  • Optimize cloud spend and resource utilization by implementing advanced autoscaling strategies, rightsizing recommendations, and reserved instance/savings plan management using FinOps best practices.
  • Define, measure, and enforce Service Level Objectives (SLOs) and Service Level Agreements (SLAs) for platforms and services.
  • Establish observability practices through metrics, distributed tracing, and logging using tools such as Prometheus, Grafana, ELK/EFK, and Dynatrace.
  • Drive proactive capacity management, chaos testing, and resilience engineering to validate system recovery under failure scenarios.
  • Advance the maturity of AIOps initiatives, leveraging machine learning techniques on telemetry data to predict and preempt potential service degradation.
  • Integrate DevSecOps practices into pipelines (e.g., Sysdig, JFrog Xray, dependency scanning, container image hardening).
  • Enforce least-privilege principles and manage secrets with tools like HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, or Kubernetes Secrets.
  • Ensure compliance with regulatory and organizational requirements.
  • Manage secrets rotation, key generation, and Public Key Infrastructure (PKI) at scale, ensuring cryptographic best practices are applied across all environments.
  • Architect and validate multi-region, multi-cloud disaster recovery strategies with automated failover testing.
  • Design recovery procedures to minimize RTO/RPO and validate through game-day exercises.
  • Document and evangelize clear runbooks and incident response plans for all major infrastructure platforms, supporting a 24/7 on-call rotation.
  • Develop and automate failover testing for all distributed systems, ensuring minimal impact during simulated regional outages.
  • Lead, mentor, and grow a high-performing DevOps team with a focus on engineering excellence and automation-first culture.
  • Translate business requirements into technical roadmaps for DevOps platforms and reliability engineering initiatives.
  • Collaborate with software engineering, security, and product leadership to align DevOps strategy with enterprise goals.
  • Partner with vendors and service providers to evaluate, implement, and optimize third-party DevOps tooling.

Benefits

  • Multiple medical plans plus dental, vision and prescription drug coverage
  • A competitive 401(k) with generous matching
  • PTO beginning at 20 days per year
  • Up to 12 paid company holidays per year
  • 2 paid days of Volunteer Time Off
  • Basic Life and AD&D Insurance
  • Short and Long-Term Disability
  • Paid Parental Leave of up to 10 weeks
  • Student Loan Assistance
  • Tuition Reimbursement
  • Backup Child and Elder Care