Director, Critical Facilities Systems

Fleet Data Centers
Mercer Island, WA

About The Position

The Director – Critical Facilities Systems owns Fleet’s centralized, 24/7 operational command-and-control functions and the digital systems that power our field execution. This leader is accountable for the Critical Facilities Operations Center (CFOC), the Network Operations Center (NOC), and the team responsible for administration, maintenance, and continuous improvement of Fleet’s operational tools (DCIM/BMS/EPMS, CMMS, ticketing/ITSM, and related platforms). This role is designed to help Fleet deliver near-perfect outcomes in safety, security, and availability by ensuring our operations centers and toolchain are reliable, scalable, well-governed, and tightly integrated with site teams, engineering, construction/commissioning, IT/network engineering, security, and customer teams.

Requirements

  • 10+ years of experience in mission-critical operations (data centers or similar critical infrastructure), including operations center / command center / NOC leadership.
  • 5+ years of people leadership experience, including building or scaling 24/7 shift-based teams (staffing, training, performance management, and accountability).
  • Strong working knowledge of critical facilities operations and telemetry, including BMS/EPMS/SCADA alarming and trends; ability to translate data into sound operational decisions.
  • Working knowledge of network operations concepts (monitoring, triage, escalation, carrier/vendor coordination, and customer communications).
  • Hands-on experience owning and administering operational platforms such as DCIM/BMS, CMMS, and ticketing/ITSM systems; strong discipline in change control and data governance.
  • Demonstrated incident management and root cause analysis skills; calm, clear-eyed execution in high-stakes, time-sensitive events.
  • Strong cross-functional leadership and communication skills; able to align stakeholders across Operations, IT, Network Engineering, Security, Construction/Commissioning, and Customer teams.
  • Willingness and ability to travel to Fleet sites as needed.
  • Integrity and Ethical Standards: Build trust, ensure fairness, and foster long-term, transparent relationships with suppliers.
  • Effective Communication: Clearly convey expectations and requirements to suppliers and negotiating parties while understanding their needs and concerns. Comfortable delivering written and verbal presentations to internal leadership teams.
  • Emotional Intelligence (EQ): Ability to understand the emotions, cultural nuances, and motivations of others, while effectively managing one's own emotions during high-pressure negotiations.
  • Strategic Thinking: Recognize how supplier relationships and negotiations align with the broader organizational goals, while aiming for outcomes that benefit both parties.
  • Critical Thinking Skills: Find innovative solutions and stay flexible when addressing unexpected challenges.
  • Analytical Ability: Make data-driven decisions, assess cost structures, and identify potential risks, ensuring informed and strategic outcomes.
  • Influence and Persuasion: Advocate effectively for a position, build consensus, and secure favorable agreements without compromising relationships.
  • Operational Paranoia: Anticipate risks, identify vulnerabilities, and proactively implement mechanisms that prevent or minimize disruptions and protect safety, security, availability, and scale.
  • Relationship Management: Cultivate trust, collaboration, and long-term partnerships, while building a broad network that provides valuable benchmarking, industry insights, and alternative sourcing options.

Responsibilities

  • Safety, security, and availability are the most important things we do. Help Fleet deliver near-perfect execution on these dimensions by building programs that are measurable, enforceable, and continuously improving.
  • Critical Facilities Operations Center (CFOC) Ownership
  • Own the 24/7 CFOC staffing model, training, qualification, and shift-lead structure; build a culture of calm, disciplined execution.
  • Monitor mission-critical facility telemetry (BMS/EPMS/SCADA, DCIM, alarms, trends) and provide first-line triage, ticket creation, and dispatch/escalation to site teams.
  • Maintain and continuously improve response playbooks, escalation paths, and communications protocols (including incident bridges and executive/customer notifications as applicable).
  • Capture high-quality incident timelines and evidence (telemetry snapshots, alarms, trends, logs) and provide an initial technical hypothesis to accelerate root cause analysis.
  • Own alarm strategy governance: thresholds, suppression, correlation, tuning, and reduction of nuisance/false alarms in partnership with engineering and site leaders.
  • Ensure operational readiness of monitoring for new sites and expansions (point lists, alarming, dashboards, runbooks, contacts, and handoff to steady-state operations).
  • Network Operations Center (NOC) Ownership
  • Own the 24/7 NOC staffing, tooling, and procedures to monitor and triage connectivity issues for Fleet and customers.
  • Receive, assess, and route network incidents and service requests; coordinate with internal network engineering, carriers, and vendors to drive rapid restoration.
  • Establish customer-facing communications standards for network incidents (status updates, ETAs, post-incident summaries) in partnership with Customer teams.
  • Maintain a disciplined process for outage tracking, incident documentation, and recurring-issue elimination through problem management.
  • Ensure network monitoring coverage and accuracy (device inventory, alerting, dashboards, and escalation contacts) and support new site/phase turn-ups.
  • Critical Systems & Operational Tools (DCIM/BMS, CMMS, Ticketing, and Related Platforms)
  • Lead the team responsible for day-to-day administration, reliability, and lifecycle management of Fleet’s operational systems: DCIM/BMS/EPMS/SCADA, CMMS, ticketing/ITSM, and supporting reporting/analytics tools.
  • Own user access governance, role-based permissions, auditability, and change control for operational tools (in alignment with Fleet’s security posture and IT controls).
  • Establish data standards and quality controls for asset registries, naming conventions, location hierarchy, alarm taxonomy, work order data, and ticket categorization to enable consistent reporting across sites.
  • Manage vendor relationships, support contracts, SLAs, and roadmaps; translate operational needs into prioritized requirements and drive delivery with partners.
  • Own system upgrades, patches, and enhancements—including testing, release management, training, and communications—to avoid downtime and user disruption.
  • Drive integrations and automation between systems (e.g., alarms-to-tickets, CMMS-to-asset registry, dashboards/BI) to reduce manual work and increase response quality.
  • Incident Support, Analytics, and Continuous Improvement
  • Define and report KPIs for operations center performance and tool health (e.g., MTTA/MTTR, dispatch time, alarm volume and quality, ticket cycle times, tool uptime, and network SLOs).
  • Partner with site leaders and engineering to drive post-incident reviews, corrective actions, and recurring-issue reduction; ensure actions are tracked to closure.
  • Identify systemic process or tooling gaps and build business cases for improvement, automation, and reliability enhancements.
  • Support audits and compliance needs by ensuring operational data, logs, and evidence are retained, accessible, and consistent.
  • Provide triage and support to site teams during events, serve as their eyes and ears, and own timely, accurate communications.

Benefits

  • Fleet Data Centers employees enjoy competitive compensation and comprehensive benefits, including 100% employer-covered medical, dental, and vision insurance, a 401(k) program, standard paid holidays, and unlimited PTO.