Technology Operations Specialist II

Bank of America • Plano, TX

1d • Onsite

About The Position

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day. Being a Great Place to Work and providing a culture of caring is core to how we drive Responsible Growth.

We are intentional about fostering an inclusive workplace where every teammate has the opportunity to succeed, build a career and contribute to our shared success. This includes attracting and developing exceptional talent, recognizing and rewarding performance, and supporting our teammates’ physical, emotional, and financial wellness through affordable, competitive and flexible benefits. We value the unique perspectives individuals bring from all backgrounds and career paths - whether shaped by military service, community college education, or a wide range of work and life experiences. These journeys foster resilience, leadership and innovation, strengthening our workforce and positively impacting the communities we serve.

Bank of America is committed to an in-office culture that supports collaboration, engagement, and career development. Our approach includes clear in-office expectations, while providing an appropriate level of flexibility based on role-specific responsibilities and business needs. At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!

This job is responsible for leading the planning, design, and implementation of complex infrastructure solutions to meet deployment requirements aligned with available playbooks and technical blueprints. Key responsibilities include providing and communicating technology solutions across audiences and overseeing projects and activities related to special initiatives or operations. Job expectations include leading the resolution process for problems, adhering to defined practices and policies to obtain results, and establishing input/output processes and working parameters for systems.

Requirements

5-6+ years of IT experience in production support, operations, automation, or SRE‑aligned roles, supporting business‑critical enterprise applications.
Strong hands‑on experience with AutoSys job scheduling, production monitoring, and troubleshooting in complex environments.
Proven expertise with Unix/Linux, Oracle/SQL, and scripting for operational support and automation.
Hands‑on experience developing automation using Python and/or PowerShell, with exposure to Machine Learning / AI concepts applied to operations, monitoring, or analytics.
Experience with automation and deployment tools such as Ansible, Cutover, BladeLogic, or equivalent orchestration platforms.
Working knowledge of Site Reliability Engineering (SRE) concepts, including reliability KPIs, SLIs/SLOs, error budgets, incident response best practices, and automation‑first operations.
Strong experience in incident triage, root cause analysis, problem management, and SLA‑driven resolution.
Experience with logging, monitoring, and observability tools such as Splunk, including proactive alerting and troubleshooting.
Solid understanding of SDLC, including development, testing, release, and promotion of automation solutions into production.
Hands‑on experience with Linux and Windows operating systems and strong proficiency with MS Office tools.
Experience with reporting and analytics tools such as Tableau and/or MicroStrategy.
Ability to coordinate production changes and releases across development, infrastructure, security, and business teams.
Strong verbal and written communication skills; ability to work effectively with diverse stakeholders and escalate when prioritization is a challenge.
Demonstrated ownership mindset with the ability to work independently, manage competing priorities, and meet tight deadlines.

Nice To Haves

Experience applying AI/ML techniques to operational data (alert noise reduction, predictive analytics, self‑healing workflows).
Reporting and visualization experience using Tableau and/or MicroStrategy.
Strong automation‑first mindset with a track record of eliminating manual processes.
Ability to work independently in a dynamic, ambiguous, and globally distributed environment.

Responsibilities

Fulfills requests from business users and operations, communicates technical status updates with appropriate teams, and oversees stability, resiliency, reliability, and the performance of multiple supported systems
Mentors other team members and provides technical leadership
Captures and translates business requirements into complex infrastructure and/or system design for specific implementations and collaborates with technology stakeholders, Solution Delivery Management teams, Technology Project Management teams, Solutions Engineering teams, and technical service providers for system design and deployment
Supports change implementations, proactively identifies and resolves potential issues resulting from the changes, and performs access and/or physical provisioning/deprovisioning (additions, modifications, and deletions) for infrastructure and applications
Provides consulting services to Core Technology Infrastructure (CTI) and technical partners, executes procedures reliably, and escalates appropriately to solve incidents quickly
Provides release support when needed and manages engagement across audiences
Provides full lifecycle management of the infrastructure and application environments
Owns monitoring and support of production AutoSys job schedules, ensuring system stability, performance, and resilience
Designs and develops automation solutions to reduce manual operational activities using tools such as AutoSys, Ansible, and scripting frameworks
Applies SRE principles to production support, including error budgets, reliability metrics, automation of operational toil, and continuous service improvement
Applies Python and AI/ML‑based techniques for intelligent alerting, anomaly detection, automated incident triage, and predictive failure analysis
Performs hands‑on incident triage, root cause analysis, and resolution while continuously identifying opportunities to automate recurring issues and operational workflows
Coordinates production changes and releases across development, infrastructure, security, and business partners, ensuring SDLC compliance and production readiness
Develops, tests, and promotes automation and reporting solutions into production environments, following enterprise SDLC standards
Implements monitoring, logging, and reporting dashboards to provide operational insights and trend analysis
Supports business continuity and disaster recovery activities, including automation of recovery procedures where feasible
Documents automated processes, operational runbooks, and AI‑assisted support procedures to enable scale and knowledge sharing