Infrastructure SRE & Cloud Platform Engineering Lead

Wells Fargo | Iselin, NJ
1d | $159,000 - $305,000 | Onsite

About The Position

About this role: Wells Fargo is seeking a...

Requirements

  • 12+ years of experience in Infrastructure Engineering, Site Reliability Engineering (SRE), Cloud Operations, or Platform Engineering, or equivalent demonstrated through work experience, training, military experience, or education.
  • 8+ years of management or leadership experience.
  • Experience in Software Engineering, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education.
  • Management or leadership experience.

Nice To Haves

  • Experience operating within a large-scale Technology organization supporting enterprise infrastructure and cloud platforms.
  • Hands‑on experience with public cloud platforms (AWS and/or Azure) and Kubernetes‑based platforms, including OpenShift (OCP).
  • Experience collaborating with senior leadership, architects, and subject matter experts to drive enterprise‑wide cloud and platform initiatives.
  • Strong bias for action with proven ability to drive cloud and reliability initiatives from concept through execution.
  • Ability to assess cross‑functional and systemic impact across multiple stakeholders while delivering key platform initiatives.
  • Strong program and project management skills, with the ability to manage multiple infrastructure and cloud initiatives simultaneously.
  • Excellent communication and presentation skills, with the ability to explain complex technical concepts to executive and non‑technical audiences.
  • Strong critical thinking, systems analysis, and troubleshooting skills, particularly in high‑availability and distributed systems.
  • Experience leading and influencing virtual and geographically dispersed engineering teams.
  • Advanced user of MS Teams, including establishing and managing channels for operational coordination.
  • Advanced Microsoft Office skills (Word, Excel, Outlook, PowerPoint, Project), with the ability to tell an executive‑level story using visuals and metrics.
  • Ability to organize and manage multiple infrastructure, cloud, and reliability priorities concurrently.
  • Strong analytical skills with high attention to operational metrics, reliability indicators, and system behavior.
  • Ability to comprehend, analyze, and interpret technical documentation, architectures, and operational data to identify critical risks and opportunities.
  • Proven ability to understand the needs of diverse audiences, including engineers, security teams, and business stakeholders.
  • Ability to define expected reliability, performance, availability, and cost outcomes and deliver on commitments.
  • Works independently while maintaining strong communication and transparency with management.
  • Demonstrated ability to drive and lead organizational change, particularly cloud adoption and SRE transformations.
  • Ability to assess incidents and systemic issues, make rapid decisions, and implement corrective and preventive actions.
  • Ability to turn ambiguous or incomplete requirements into well‑defined cloud and platform solutions.
  • Excellent verbal, written, and interpersonal communication skills.
  • Outstanding problem‑solving and decision‑making skills under pressure.
  • Advanced organizational and project management skills; ability to juggle multiple priorities in a dynamic, 24x7 operational environment.
  • Ability to work outside of regular business hours and participate in on‑call rotations as required.
  • Willingness to work on‑site at the stated location for the job opening.

Responsibilities

  • Lead the strategy, design, and resolution of the most complex and high‑impact infrastructure, cloud, and platform challenges across Technology to drive reliability, scalability, and modernization.
  • Support technology and application teams by engineering reliable, secure, and automated cloud platforms that provide efficient, timely, and highly available services to business partners and internal teams.
  • Drive site reliability engineering (SRE) practices, including defining and managing SLIs, SLOs, SLAs, error budgets, incident response, and blameless post‑incident reviews.
  • Ensure service quality, resiliency, security, and cost effectiveness of cloud and platform solutions across AWS, Azure, and OpenShift (OCP) environments.
  • Review customer and platform requirements with engineering teams and implement highly complex infrastructure and platform initiatives impacting multiple lines of business or the enterprise.
  • Lead the adoption of GitOps and Infrastructure‑as‑Code (IaC) practices using tools such as Terraform, Helm, Argo CD, Flux, and CI/CD pipelines to standardize deployments and reduce configuration drift.
  • Drive systematic toil reduction by automating repeatable operational tasks, eliminating manual processes, and building self‑service platform capabilities.
  • Embed DevSecOps controls into cloud and platform pipelines, including policy‑as‑code, secrets management, image scanning, and compliance automation.
  • Understand and ensure risk management, security, and compliance requirements for supported platforms and partner with security, risk, and audit teams to implement key initiatives.
  • Make decisions in complex, ambiguous, and multifaceted situations involving multiple concurrent infrastructure, cloud, and platform initiatives.
  • Collaborate and consult with technology teams and senior leadership to resolve infrastructure and reliability issues and deliver optimal cloud‑native solutions.
  • Provide technical leadership and guidance on leveraging new cloud, container, and platform technologies aligned with enterprise standards.
  • Manage, coach, and develop a team or teams of experienced engineers and engineering managers in roles with moderate complexity and risk, responsible for building high-quality capabilities with modern technology.
  • Ensure adherence to the Banking Platform Architecture and meet non-functional requirements with each release.
  • Partner with, engage, and influence architects and experienced engineers to incorporate Wells Fargo Technology technical strategies, while understanding next-generation domain architecture and enabling application migration paths to the target architecture (for example, cloud readiness, application modernization, and data strategy).
  • Function as the technical representative for the product during cross-team collaborative efforts and planning.
  • Identify and recommend opportunities for driving escalated resolution of technology roadblocks, including code, build, and deployment, while also managing the overall software development cycle and security standards.
  • Determine the appropriate strategy and actions to act as an escalation partner for scrum masters and teams to meet moderate- to high-risk deliverables, and help remove impediments, obstacles, and friction while encouraging constant learning, experimentation, and continual improvement.
  • Build engineering skills side-by-side in the codebase, conduct peer reviews to evaluate quality and solution alignment to technical direction, and guide design as needed.
  • Interpret, develop, and ensure security, stability, and scalability within functions of technology with moderate complexity, and identify, manage, and mitigate technology and enterprise risk.
  • Collaborate with, partner with, and influence Product Managers/Product Owners to drive user satisfaction, influence technology requirements and priorities in the product roadmap, promote innovative and intelligent solutions, generate corporate value, and articulate technical strategy while being a solid advocate of agile and DevOps practices.
  • Interact directly with third-party vendors and technology service providers.
  • Manage allocation of people and financial resources to ensure commitments are met and align with strategic objectives in technology engineering.
  • Hire talent and build and guide a culture of talent development so that teams have the skills required to effectively design and deliver innovative solutions for product areas and products that meet business objectives and strategy, and conduct performance management for engineers and managers.

Benefits

  • Health benefits
  • 401(k) Plan
  • Paid time off
  • Disability benefits
  • Life insurance, critical illness insurance, and accident insurance
  • Parental leave
  • Critical caregiving leave
  • Discounts and savings
  • Commuter benefits
  • Tuition reimbursement
  • Scholarships for dependent children
  • Adoption reimbursement