Director - Platform Engineering & Development Operations

FORTNA • Atlanta, GA

About The Position

FORTNA partners with the world’s leading brands to transform omnichannel and parcel distribution operations. Known worldwide for enabling companies to keep pace with digital disruption and growth objectives, we design and deliver solutions, powered by intelligent software, to optimize fast, accurate and cost-effective order fulfillment and last-mile delivery. Our people, innovative approach and proprietary algorithms and tools ensure optimal operations design and material and information flow. We deliver exceptional value every day to our customers with comprehensive services and products including network strategy, distribution center operational design and implementation, automated material handling equipment, robotics and a comprehensive suite of lifecycle services.

At FORTNA, we believe in fostering a workplace that isn't just a job but a movement: a collective effort to redefine success and transform challenges into opportunities. "Join the Movement" encapsulates our commitment to a workplace culture that thrives on collaboration, celebrates diversity, and empowers every individual to contribute to something greater than themselves.

Our Team. Our Passion. Our Approach.

The Mission

Own, evolve, and operate the platform that powers FORTNA’s mission-critical software. This is a hands-on leadership role. You will personally design, build, and operate RHEL and OpenShift/Kubernetes platforms, establish a secure and scalable GitOps delivery model, and measurably improve reliability, deployment velocity, and recovery outcomes for Tier-1 services. You will set a high operational bar (automated, auditable, and secure by default) while enforcing strong governance, including no direct developer access to production.

Requirements

Bachelor’s degree in Computer Science, Information Systems, or related field; Master’s preferred.
10+ years in IT, with at least 5 years in platform engineering, DevOps, or SRE leadership.
Linux: Recent, production-grade RHEL administration and operations.
Containers/Orchestration: Operating OpenShift (or enterprise Kubernetes) in production, including cluster upgrades, operators, networking/ingress, quotas/limits, security contexts, and multi-cluster promotion.
GitOps & CD: Operating Argo CD in production (app-of-apps, sync waves/health, drift detection, secrets management; progressive delivery with Argo Rollouts/Flagger or equivalent).
CI & Artifacts: Building/owning GitHub Actions pipelines (reusable workflows, runners) and managing Artifactory (images/charts, immutability, SBOM/attestation, provenance/signing).
Virtualization: Designing/supporting VMware and/or Nutanix environments (HA clusters, templates, storage/network design, backup/restore).
Ops & Governance: Demonstrated no-dev-in-prod controls, incident leadership, SLO/error-budget practice, and DR testing.

Nice To Haves

AWS/EKS operations (VPC/subnets/NAT, IRSA/IAM, add-ons, KMS/Secrets Manager/SSM, cost guardrails).
IaC at scale (Terraform/Ansible), OPA/Conftest, service mesh, Backstage developer portal.
Observability stacks (Prometheus/Grafana/ELK, Datadog/New Relic) and ITSM integration (ServiceNow/Jira).
AI-assisted engineering (CodeQL, Copilot/ChatGPT) with guardrails.

Leadership Skills

Strong people management and team-building capabilities.
Exceptional communication and stakeholder engagement skills.
Ability to bridge technical and business objectives.

Responsibilities

Platform Engineering & Operations:
Design, build, upgrade, and operate RHEL fleets and OpenShift/Kubernetes clusters across multiple environments.
Own platform concerns end-to-end: networking, storage, identity, security policies, scaling, and HA/DR.
Drive platform standardization through opinionated, well-documented golden paths.

CI/CD, GitOps & Delivery:
Architect and operate the paved path: GitHub Actions → Artifactory → Argo CD.
Build reusable GitHub Actions workflows and runners; manage artifact immutability, retention, and provenance.
Operate Argo CD at scale: app-of-apps, sync waves, health checks, drift remediation, and secrets.
Enable progressive delivery patterns where appropriate.

Virtualization & Infrastructure:
Design and run VMware and/or Nutanix estates, including HA clusters, templates/golden images, micro-segmentation, backup/restore, and capacity planning.

DevSecOps & Automation:
Embed security by design into build and promotion pipelines, including SAST, DAST, secret scanning, image signing, SBOMs, and policy-as-code gates.

Reliability & SRE Practices:
Define and operationalize SLIs, SLOs, and error budgets.
Improve deployment frequency and MTTR.
Lead incident response and drive blameless post-mortems.
Own DR objectives and testing cadence.

Governance & Collaboration:
Enforce no-human-in-prod via RBAC, GitOps controls, and auditable break-glass procedures.
Partner closely with Application Engineering, QA, Security, and Architecture.

Leadership & Enablement:
Lead, mentor, and grow Platform/DevOps/SRE engineers.
Set quarterly OKRs and platform roadmaps.
Drive adoption through reference architectures and enablement.