Total Wine & More is seeking a Senior DevOps Engineer to join our Technology team in Bethesda, MD or Boca Raton, FL.

In this role, you will run Kubernetes on AKS and GKE with a strong focus on security and reliability. You'll use Argo Workflows to orchestrate data pipelines and scheduled jobs. You'll build, maintain, and improve CI/CD with Jenkins and GitHub Actions. You'll grow Backstage into a practical internal developer platform that empowers engineers with golden paths, reusable templates, and self-service options. You'll operate Kafka, CockroachDB (Postgres-compatible), Couchbase, and Elasticsearch for performance, resilience, and cost-efficiency. You'll help lead observability with Prometheus for metrics, Grafana for dashboards, and Tempo for tracing. You'll collaborate with teams building in C#/.NET, Go, and Node.js, and help drive our AI initiative by identifying practical use cases and integrating tools that deliver measurable impact.

You will report to the Sr. Manager, Platform Engineering.

You will
Own multi-cloud Kubernetes platforms on Azure AKS and Google Cloud GKE: design cluster topology, networking, RBAC, and policies for secure, scalable, cost-efficient operation.
Build and evolve the Internal Developer Platform with Backstage: service catalog, golden-path templates, scorecards, and self-service scaffolding to standardize app onboarding.
Engineer CI/CD at scale using Jenkins and GitHub Actions: pipelines-as-code, environment promotion, secrets management, and artifact provenance.
Orchestrate batch and data workflows with Argo Workflows (not CI/CD): multi-tenant DAGs, resource quotas, artifact and versioning strategy, and guardrails.
Operate and tune stateful services (Kafka, CockroachDB (Postgres-compatible), Couchbase, Elasticsearch), including capacity planning, replication, backup/restore, and DR.
Establish end-to-end observability: Prometheus metrics, Grafana dashboards, Grafana Tempo tracing, SLOs and error budgets, actionable alerting, and on-call runbooks.
Build platform tooling and automation in Go, Node.js, and C#/.NET: CLIs, controllers/operators, APIs, and integrations that improve developer experience.
Drive security, compliance, and reliability practices: image signing and SBOMs, secrets management, network policies, least privilege, cost monitoring, incident response, and postmortems.
Job Type
Full-time
Career Level
Senior
Education Level
No Education Listed