About The Position

Join NVIDIA as a Data Center Operations Controls Engineer. In this role, you will define and manage the operational readiness, support, and governance of Cronus, our leading monitoring and control platform. You will work in an innovative environment, partnering with data center teams, Engineering, and the FOC to deliver the technical roadmap smoothly. This role suits those driven to make a meaningful difference with our newest technology.

Requirements

  • 12+ years of experience in operations, controls, or monitoring systems in data center, industrial, or large-scale infrastructure environments.
  • B.S. in a related field or equivalent experience.
  • Strong understanding of controls systems, monitoring platforms, or SCADA-like tools, including alarms, setpoints, and configuration management.
  • Proven success partnering with engineering, operations, and vendor teams to stabilize and improve technical platforms.
  • Excellent communication skills, with the ability to translate technical issues into clear operational actions for frontline teams.
  • Track record of defining and using operational metrics to drive performance and reliability improvements.

Nice To Haves

  • Experience managing data center operations or critical facilities.
  • Experience with Ignition control systems.
  • Background in process control, industrial automation, or building management systems.
  • Experience leading integrations between monitoring platforms and other infrastructure tools.
  • Experience with change process oversight, incident response, and configuration control approaches.

Responsibilities

  • Partner with Controls Engineering to prioritize and coordinate the resolution of critical UI, stability, and interoperability issues affecting data center operations.
  • Lead operational cleanup at live sites, including nuisance alarm reduction, disabled point remediation, and restoration of a usable monitoring baseline.
  • Collaborate with engineering and operations to establish and uphold a consistent Controls version and configuration baseline, including setpoints, thresholds, and alarm defaults.
  • Help establish naming standards, topology mapping methods, and configuration governance to ensure consistency across sites.
  • Own the development and delivery of training, documentation, and knowledge transfer for data center operators and FOC teams using the Controls system.
  • Support the planning and rollout of integrations between Controls and key infrastructure tools (asset, power, and monitoring systems), focusing on operational value and adoption.
  • Define and track key operational metrics, such as incident response times, alarm quality, and configuration compliance, and drive continuous improvement.

Benefits

  • Equity
  • Benefits