Director of DevOps

Herndon, VA
Onsite

About The Position

The Director of DevOps sets the strategy and runs the day-to-day for Exostar’s global, 24x7 production operations. The role is the technical and operational backbone for every customer-impacting issue: it owns the engineering response that Customer Support depends on, partners closely with Support on customer outcomes, and is the single accountable leader on the engineering side of every Sev-1/Sev-2. This is a highly visible role in a fast-growing, fast-moving company — every dollar of cloud and tooling spend gets reported up through this leader, and every step we take toward an AI-native operations posture is driven from this seat. The successful candidate is an automation-first technologist, a cross-organizational operator who is equally credible with engineering, product, security, support, and finance, and a process-minded leader who treats manual toil as a defect and uses AI as the default tool to eliminate it.

Requirements

  • 10+ years running production technical operations for a customer-facing SaaS business with hard uptime SLAs.
  • 5+ years leading multi-disciplinary teams of 25+ across DevOps/SRE and IT.
  • Deep, hands-on technical expertise: cloud (AWS and/or Azure), Kubernetes, infrastructure-as-code (Terraform/Bicep), CI/CD, observability stacks, secrets and identity.
  • Demonstrated bias toward automation over headcount — with specific examples of toil eliminated, deploy frequency increased, or incident volume reduced through engineering.
  • Track record of using AI in production: LLM-based assistants, agentic workflows, AI-driven observability — not just experimenting with it.
  • Strong financial fluency: built and defended a multi-million-dollar infrastructure budget, owned COGS, partnered with a CFO/Finance organization on unit economics.
  • Demonstrated success partnering with Customer Support, Customer Success, or equivalent functions on joint operating models, incident handoffs, and shared customer-experience metrics.
  • Strong customer-facing presence: composed on a Sev-1 bridge with a CISO and credible in a QBR with a Fortune 100 customer.
  • Excellent written and verbal communication. This person writes the postmortem, briefs senior leadership, and explains the COGS variance — clearly.
  • Process discipline and detail orientation: reads the runbook, audits the dashboard, and notices the inconsistency.
  • Experience operating in regulated environments — SOC 2, NIST 800-171, ITAR, CMMC, or FedRAMP.
  • U.S. citizens only: due to customer requirements, U.S. citizenship is required.
  • Ability to gain and maintain Trusted Role is required.

Nice To Haves

  • Direct experience standing up an AIOps program or deploying AI agents inside a production operations function.
  • Experience in defense supply chain, aerospace, or other regulated industries.
  • Familiarity with FedRAMP Moderate, CMMC Level 2, and NIST 800-53 / 800-171 Rev 2 control families.
  • Experience with PKI, HSMs, and identity-bound infrastructure.
  • Career arc that includes both “build” (engineering org) and “run” (operations org) leadership.
  • Bachelor’s degree in a technical discipline.
  • Advanced degree preferred.

Responsibilities

  • Own production uptime, performance, and reliability across all Exostar SaaS products on a 24x7 basis, including SLO definition, on-call rotation, incident command, and stakeholder communications during major incidents.
  • Own the engineering and operational response to every customer-impacting issue.
  • Partner with Customer Support on triage, root cause, communications, and resolution — Support owns the customer relationship; this role owns the technical fix and the systemic prevention.
  • Drive blameless postmortems and ensure every Sev-1/Sev-2 results in a durable engineering or process fix — not a “we’ll watch for it next time.”
  • Be the single accountable engineering leader when a customer asks “what happened, and what are you doing about it.”
  • Treat manual operational work as a defect. Set and enforce a target for the percentage of operational toil eliminated each quarter.
  • Drive infrastructure-as-code, GitOps, automated remediation, and self-healing patterns across the production estate.
  • Build a deployment platform that lets engineering teams ship safely and frequently without DevOps as a bottleneck.
  • Own CI/CD pipeline strategy, golden paths, and the developer experience for shipping to production.
  • Stand up and continuously evolve an AIOps practice: AI-driven anomaly detection, log summarization, intelligent alerting, and agentic incident triage.
  • Deploy AI agents to draft runbooks, post first-pass postmortems, and accelerate engineering investigation of customer-reported issues.
  • Mine operational and incident data with AI for recurring failure modes and capacity drift and turn those into engineering bets.
  • Operate as a peer to engineering, product, security, customer support, and finance leaders. This role lives at the intersection of those functions and has to be effective in all of them.
  • Partner with Customer Support on the joint operating model: incident handoffs, ticket-to-engineering workflows, status communications, and shared metrics for customer experience during issues.
  • Partner with Product Management and Finance on launch-readiness, capacity planning, and pricing/COGS modeling for new and existing services.
  • Partner with the Security Office on compliance, audit readiness, and secure-by-default infrastructure (SOC 2, NIST 800-171, CMMC, FedRAMP-adjacent).
  • Represent Operations in customer escalations, audit conversations, and revenue-impacting deals.
  • Own the cost-of-goods line for Exostar’s hosted services. Forecast, track, and explain it monthly to Finance and senior leadership.
  • Drive cloud cost optimization (commitments, right-sizing, idle elimination, architectural efficiency) as an ongoing discipline, not an annual project.
  • Build the unit-economics views Finance needs to run the business: cost per customer, per product, per environment.
  • Own vendor relationships and contracts for infrastructure, observability, and managed services; lead RFPs and renewals.
  • Stand up and maintain the operating cadence: weekly ops reviews, monthly business reviews, quarterly capacity planning, incident review boards (jointly with Customer Support and Engineering leadership).
  • Define and report KPIs and KRIs the CTO and CFO can use to run the business: availability, MTTR, deploy frequency, COGS per unit, automation coverage, and engineering-side metrics on customer-impacting issues.
  • Maintain the system of record for production inventory, dependencies, and configuration.
  • Own the disaster recovery and business continuity program — including the drills, not just the plans.
  • Lead, coach, and grow the DevOps team. Build a culture where engineers default to automation and the whole team takes pride in customer outcomes — even when the customer relationship is held by a partner team.
  • Hire for automation instinct, technical depth, and cross-functional collaboration.
  • Performance-manage against an automation and AI-leverage bar, not headcount growth.

Benefits

  • Employee development
  • Internal promotion
  • Training and educational assistance
  • Fun, engaged workplace with social and community-building events
  • Comprehensive benefits
  • Flexible time-off plans