Lead Platform Engineer

Total Wine & More

About The Position

Total Wine & More is seeking a Lead Platform Engineer to join our Technology team in our Bethesda, MD office. You'll own Kubernetes on AKS and GKE with a strong focus on security, reliability, and cost efficiency, establishing standards and guardrails. You'll use Argo Workflows to orchestrate data pipelines and scheduled jobs at scale. You'll lead the design, build, and continuous improvement of CI/CD with Jenkins and GitHub Actions. You'll grow Backstage into a practical internal developer platform that empowers engineers with golden paths, reusable templates, and self-service options. You'll operate Kafka, CockroachDB (Postgres-compatible), Couchbase, and Elasticsearch for performance, resilience, and spend efficiency. You'll raise the bar on observability with Prometheus, Grafana, and Tempo for tracing, defining SLOs and actionable alerting. You'll mentor engineers and partner with Product and Security teams building in C#/.NET, Go, and Node.js, while leading our AI initiative by identifying practical use cases and integrating tools that deliver measurable impact. You will report to the Sr. Manager, Platform Engineering.

Requirements

  • Bachelor's degree or equivalent years of experience preferred; 5-8 years of experience preferred.
  • Multi-cloud Kubernetes ops (AKS/GKE): topology, RBAC, policies, upgrades, cost control.
  • CI/CD (Jenkins, GitHub Actions): pipelines-as-code, promotions, rollbacks, artifact hygiene.
  • Backstage IDP: service catalog, scaffolder templates, golden paths, scorecards/governance.
  • Argo Workflows (not CI/CD): DAG design, multi-tenant guardrails, quotas, artifact/versioning.
  • Distributed data ops: Kafka, CockroachDB, Couchbase, Elasticsearch—tuning, backup/restore, DR.
  • Observability: Prometheus, Grafana, Tempo—SLO/SLI design, alerting, actionable runbooks.
  • Platform Dev: Go, Node.js, C#/.NET.

Responsibilities

  • Manage multi-cloud Kubernetes operations on Azure AKS and Google Cloud GKE: cluster topology, networking, identity/RBAC, policies, upgrades, and cost controls.
  • Support CI/CD engineering with Jenkins and GitHub Actions: pipelines-as-code, environment promotion, secrets management, artifact provenance, and rollback strategies.
  • Manage IDP patterns using Backstage: Service Catalog, TechDocs, Scaffolder templates, and scorecards to define golden paths and governance.
  • Support workflow orchestration with Argo Workflows (explicitly not for CI/CD): DAG design, artifact/versioning, resource quotas, and multi-tenant isolation.
  • Operate distributed data systems (Kafka, CockroachDB (Postgres-compatible), Couchbase, and Elasticsearch), including replication, performance tuning, backup/restore, and DR.
  • Support observability engineering with Prometheus, Grafana, and Grafana Tempo: SLO/SLI design, alert tuning, tracing, and actionable dashboards/runbooks.
  • Develop platform software in Go, Node.js, and C#/.NET: CLIs, operators/controllers, APIs, and automation that harden and simplify the platform.

Benefits

  • Paid Time Off (PTO)
  • Generous store discounts
  • Health care plans (medical, prescription, dental, vision)
  • 401(k), HSA, FSA, Pre-tax commuter benefits
  • Disability & life insurance coverage
  • Paid parental leave
  • Pet insurance
  • Critical illness and accident insurance
  • Discounted home and auto insurance
  • College tuition assistance
  • Career development & product training
  • Consumer classes & More!