Systems Operations Manager – Data Platforms – Teradata & Hadoop

Wells Fargo & Company
Irving, TX
Hybrid

About The Position

Wells Fargo is seeking a Systems Operations Manager to lead the end-to-end support and operations of enterprise Teradata and Hadoop data platforms powering large-scale analytics and business decisioning. This role is accountable for platform stability, reliability, and operational excellence across a complex, multi-tenant ecosystem supporting 100+ tenants. The manager will lead a 24x7 operations team, apply Site Reliability Engineering (SRE) principles, and drive automation-led transformation to ensure predictable, resilient service delivery at scale. This is a hands-on leadership role requiring strong execution discipline, ownership, and the ability to operate in a high-risk, regulated environment, ensuring SLA adherence, compliance, and business continuity outcomes.

Requirements

  • 5+ years of Systems Engineering and Technology Architecture experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
  • 2+ years of Leadership experience
  • Hands-on experience with: Teradata and Hadoop platforms
  • Hands-on experience with: Distributed systems and data platform operations
  • Hands-on experience with: Incident, problem, and change management processes

Nice To Haves

  • Experience supporting enterprise-scale Teradata and Hadoop platforms
  • Demonstrated leadership in 24x7 production support and SRE environments
  • Strong experience in: Automation, AIOps, and operational transformation
  • Strong experience in: DevSecOps and CI/CD practices
  • Strong experience in: Observability, monitoring, and platform telemetry
  • Familiarity with Kubernetes, containerization, and cloud-native architectures
  • Strong understanding of: Multi-tenant data platforms and workload management
  • Strong understanding of: Regulatory, audit, and risk-controlled environments

Responsibilities

  • Lead end-to-end platform operations for Teradata and Hadoop environments, ensuring availability, performance, and resilience
  • Provide clear ownership and accountability for production services, operational outcomes, and service stability
  • Drive incident, problem, and change management, including major incident command and recovery leadership
  • Lead 24x7 global support operations, including on-call governance and escalation management
  • Apply SRE principles to improve reliability, availability, and scalability of data platforms
  • Drive automation-first operations to eliminate manual toil and standardize service delivery
  • Implement and enhance observability, monitoring, and self-service capabilities
  • Partner with engineering teams to improve platform reliability, operability, and service maturity
  • Drive adoption of automation, observability, and AIOps practices to reduce manual toil and improve MTTR
  • Own and drive SLA/OLA adherence, uptime, and service health metrics
  • Lead capacity management, performance tuning, and proactive issue prevention initiatives
  • Establish and enforce operational standards, runbooks, and service management practices
  • Drive root cause analysis (RCA) and long-term remediation of systemic issues
  • Ensure alignment with enterprise risk, compliance, and change management frameworks
  • Drive patching, vulnerability remediation, and platform security posture
  • Maintain audit readiness, documentation quality, and control adherence
  • Identify, escalate, and mitigate operational and platform risks
  • Manage operations across shared, multi-tenant platforms, ensuring workload isolation and stability
  • Oversee resource allocation, scheduler configuration, and workload prioritization
  • Execute in high-risk production environments where changes impact multiple tenants simultaneously
  • Partner with Engineering, CIO-aligned teams, Cybersecurity, and LOB stakeholders
  • Provide clear, executive-ready communication on platform health, risks, and priorities
  • Drive cross-functional accountability and execution discipline across teams
  • Lead, coach, and develop a team of Systems Operations engineers and analysts
  • Build a culture of ownership, accountability, and operational excellence
  • Manage resource allocation, workforce planning, and vendor/partner support
  • Develop team capabilities in SRE practices, automation, and platform operations maturity
  • Ensure resiliency posture across Teradata and Hadoop platforms, including disaster recovery (DR) readiness and execution, RTO/RPO alignment and validation, and continuous improvement of recovery capabilities
  • Lead BCP execution and failover coordination for critical platforms

Benefits

  • Wells Fargo is an equal opportunity employer.