About The Position

About this role: Wells Fargo is seeking a Senior Wires Support Engineer in Technology as part of the Commercial Corporate & Investment Bank Technology (CCIBT) engineering and transformation office. Learn more about the career areas and lines of business at wellsfargojobs.com.

The position supports Money Transfer System applications (domestic and international). The team member is expected to provide day-to-day support and operational activities for EMTS and GMTS. This includes alert handling, incident handling, start-of-day and end-of-day activities, service requests, changes, and problem management.

In This Role, You Will

Platform & Reliability Engineering

  • Embed SRE and production engineering principles into Payments Modernization from design through early life support
  • Define and validate non-functional requirements (NFRs) covering resilience, scalability, observability, recovery, and operability
  • Drive replay, retry, and exception-handling validation for event-driven payment flows
  • Lead capacity and performance testing, including volume growth and peak event scenarios (e.g. FedNow, CHIPS, SWIFT)

Service Transition & Operational Readiness

  • Own Permit-to-Operate readiness across environments (NFR Testing)
  • Define cutover, shadow support, and early life support models
  • Ensure runbooks, support procedures, on-call readiness, and escalation paths are production-grade before go-live
  • Partner with Change Assurance to apply risk-based release controls, canary/blue-green strategies, and rollback automation

Observability & Stability

  • Implement end-to-end observability across Kafka, MongoDB, API layers, and downstream payment components
  • Define and monitor SLOs, error budgets, and golden signals
  • Reduce alert noise through signal design, correlation, and automation
  • Analyze early defects and exception patterns (ACK/NACKs, business errors) to drive stabilization

Chaos Engineering & Continuous Improvement

  • Design and execute controlled failure testing (chaos engineering) to validate recovery patterns and blast radius
  • Lead blameless RCAs, ensuring corrective actions are owned and recurrence is prevented
  • Drive continuous service improvement (CSI) initiatives, including automation, resilience uplift, and technical debt reduction

Requirements

  • 4+ years of Systems Engineering, Technology Architecture experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
  • 2+ years of application support experience
  • 2+ years of Application Frameworks experience in Spring Boot, Spring WebFlux, etc.
  • 2+ years of Data Stores & Caching experience with MongoDB, Redis
  • 2+ years of Platform experience in Kubernetes / container orchestration
  • 2+ years of CI/CD & Automation experience in Progressive delivery, automated rollback, reliability-as-code concepts

Nice To Haves

  • 2+ years of Resilience experience with Resilience4J and retry/replay patterns
  • 2+ years of Observability experience: distributed tracing, metrics, logging, SLO tooling
  • 2+ years of Testing & Resilience Validation experience: BlazeMeter, Chaos Monkey
  • Strong experience in SRE, Production Engineering, Platform Engineering, or Service Transition within a complex technology or financial services environment
  • Demonstrated ability to productionize new platforms, not just support them
  • Solid understanding of high-value payment systems (Wires, RTP, SWIFT, CHIPS, FedNow) and their operational risk profile
  • Experience working with event-driven, distributed architectures
  • Proven ability to partner with engineering teams while representing the production and operational lens
  • Comfortable operating in early-stage, ambiguous transformation environments
  • Strong communication skills, with the ability to explain technical risk to senior stakeholders

Benefits

  • Health benefits
  • 401(k) Plan
  • Paid time off
  • Disability benefits
  • Life insurance, critical illness insurance, and accident insurance
  • Parental leave
  • Critical caregiving leave
  • Discounts and savings
  • Commuter benefits
  • Tuition reimbursement
  • Scholarships for dependent children
  • Adoption reimbursement

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000
