Manager, Technical Operations Center

Take-Two Interactive Software, Inc., Austin, TX

About The Position

As the Manager of the Technical Operations Center (TOC), you will lead Take-Two's Austin-based and international TOC teams, serving as the primary responders for our global infrastructure. This 24x7 team, composed of first-line, site reliability, and senior engineers, is responsible for the initial triage of critical business platforms, public cloud services, and on-premises compute, network, and storage. Your leadership will ensure the constant availability and peak performance of services essential to our worldwide operations. In addition to infrastructure response, the TOC Manager oversees Take-Two's Problem and Change Management practices. You will manage dedicated leads responsible for the change management lifecycle, the Change Advisory Board (CAB), and the effectiveness of our Problem Management practice, which focuses on root cause analysis, post-incident reviews, and proactive trend analysis to prevent future service disruptions.

Requirements

  • 5+ years of relevant experience in large-scale production networks or systems operations, with a strong background in site reliability engineering (SRE) principles.
  • Proven experience leading globally distributed, 24x7 operational teams and managing major incident handling.
  • Strong technical background in managing and maintaining global infrastructure, including a firm understanding of:
      • Public Cloud Providers: AWS, GCP, and Azure.
      • Operating Systems: Linux and Windows in production environments.
      • Virtualization: VMware, Proxmox, KVM, or Hyper-V.
      • Infrastructure as Code (IaC) Concepts: Terraform or Ansible.
  • Exceptional communication and collaboration skills, with the ability to clearly articulate technical concepts to both technical staff and senior management.
  • Experience ensuring operational activities are thoroughly documented and compliant with audits.
  • A bias towards automation and optimizing operational workflows for service delivery.

Responsibilities

  • Oversee Take-Two’s global Technical Operations Center (TOC) staff, including front-line first-responder, senior, and site reliability engineers, ensuring 24x7 availability and performance of critical business services.
  • Direct first response and triage efforts for the global infrastructure footprint, encompassing self-hosted compute, storage, network, critical business services, and public cloud infrastructure.
  • Lead the consolidation and streamlining of TOC/NOC staff into regional centers, aligning resources for function and location.
  • Drive continuous operational improvements by implementing standardized communications and a governed, iterative process for creating and updating runbooks, ensuring all new monitoring intake is documented.
  • Improve operational efficiency and incident response, with a focus on reducing Mean Time To Resolve (MTTR).
  • Manage safe execution and coverage of L1/L2 manual workloads, including end-user-facing synchronous and asynchronous workloads such as onboarding and post-onboarding provisioning.
  • Refactor and align standardized operations functional roles in collaboration with other teams such as Advanced Operations, Global Support, and Systems Engineering.
  • Develop the TOC into a resident Observability center of excellence and an operational analytics capability supporting internal and external workloads.

Benefits

  • Great Company Culture.
  • Growth.
  • Work Hard, Play Hard.
  • Benefits. Medical (HSA & FSA), dental, vision, 401(k) with company match, employee stock purchase plan, commuter benefits, in-house wellness program, broad learning & development opportunities, a charitable giving platform with company match and more!
  • Perks. Fitness allowance, employee discount programs, free games & events and stocked pantries.