About The Position

GitLab is the intelligent orchestration platform for DevSecOps. GitLab enables organizations to increase developer productivity, improve operational efficiency, reduce security and compliance risk, and accelerate digital transformation. More than 50 million registered users and more than 50% of the Fortune 100 trust GitLab to ship better, more secure software faster. The same principles built into our products are reflected in how our team works: we embrace AI as a core productivity multiplier, with all team members expected to incorporate AI into their daily workflows to drive efficiency, innovation, and impact.

GitLab is where careers accelerate, innovation flourishes, and every voice is valued. Our high-performance culture is driven by our values and continuous knowledge exchange, enabling our team members to reach their full potential while collaborating with industry leaders to solve complex problems. Co-create the future with us as we build technology that transforms how the world develops software.

Fortune 500® is a registered trademark of Fortune Media IP Limited, used under license. Claim based on GitLab data. Fortune 100 refers to the top 20% ranked companies in the 2025 Fortune 500 list, published in June 2025. Fortune and Fortune Media IP Limited are not affiliated with, and do not endorse products or services of GitLab.

An overview of this role

As a Senior Engineer on the Runway team, you'll lead the design, evolution, and operation of the Kubernetes-based platform and developer tooling that powers GitLab's engineering organization. You'll drive strategic infrastructure initiatives across platform architecture, automation, and developer experience. That includes operating production Kubernetes clusters across cloud environments, scaling our ArgoCD-based GitOps workflows, and setting infrastructure-as-code practices and standards across teams. You'll mentor engineers, influence architectural decisions, and drive platform improvements that enhance reliability, observability, and security controls like RBAC and secrets management. Your work will establish clear patterns that make it easier for application teams to adopt modern practices and ship with confidence.

Some examples of our projects:

  • Evolve ArgoCD GitOps standards across environments (Application Sets, sync policies, and deployment guardrails)
  • Build reusable Terraform modules and practices for safe, repeatable cloud infrastructure provisioning and drift detection

Requirements

  • Experience operating and evolving production Kubernetes clusters (upgrades, scaling, disaster recovery, reliability) across one or more cloud environments (for example, Amazon EKS, Google GKE, or Azure AKS).
  • Experience designing and running GitOps-based continuous delivery workflows with ArgoCD, Flux, or similar tools; able to establish and maintain deployment standards across environments.
  • Experience with infrastructure as code (Terraform or equivalent), including reusable modules, state management, and drift detection practices for safe infrastructure provisioning.
  • Ability to write and maintain automation using a scripting language (for example, Python, Bash, or Go) and guide others on best practices.
  • Working knowledge of networking fundamentals (DNS, load balancing, ingress) and related platform patterns (for example, service mesh) to design reliable network architectures.
  • Strong written and verbal communication skills, including mentoring, writing clear system documentation, and establishing runbooks and best practices across teams.

Responsibilities

  • Lead the operation and evolution of production-grade Kubernetes clusters across cloud environments, making architectural decisions on upgrades, scaling, disaster recovery, and reliability improvements that impact the entire organization.
  • Define and drive GitOps strategy and standards across the organization, owning ArgoCD-based workflows by architecting Application Sets, sync policies, and deployment standards, and mentoring teams on GitOps best practices.
  • Architect and establish Terraform-based infrastructure-as-code standards across teams, building reusable modules and practices that enable safe, scalable cloud infrastructure provisioning while establishing clear patterns for state management and drift detection.
  • Lead platform observability strategy and incident response processes, set standards for monitoring and post-incident reviews, and drive organization-wide improvements to availability, performance, and resilience.
  • Partner with and mentor application teams to onboard services onto the platform, establishing patterns for documentation, runbooks, and self-service tooling that scale across the organization and improve developer productivity.
  • Design and establish security control standards such as role-based access control (RBAC), network policies, and secrets management (for example, Vault, Sealed Secrets, or External Secrets Operator) that meet compliance requirements and scale across the organization.
  • Drive integration of platform capabilities with continuous integration pipelines (for example, GitHub Actions, GitLab CI, or Tekton) to establish end-to-end delivery workflows that set standards across the organization.

Benefits

  • Benefits to support your health, finances, and well-being
  • Flexible Paid Time Off
  • Team Member Resource Groups
  • Equity Compensation & Employee Stock Purchase Plan
  • Growth and Development Fund
  • Parental leave
  • Home office support

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 501-1,000 employees
