Lead AI Engineer

Wells Fargo Bank • Irving, TX

About The Position

Shape the Future of Enterprise Operations with Generative AI

COO Technology sits at the heart of the enterprise, powering the systems that keep the Chief Operating Office running at scale, from strategic execution, resiliency, and regulatory enablement to customer experience, supply chain, and shared services. This team drives large-scale modernization, building resilient, data-driven platforms that enable operational excellence across the firm.

We are seeking a Lead Specialty Software Engineer to play a pivotal role in defining and delivering next-generation Generative AI capabilities within this mission-critical environment. This is a deeply hands-on engineering role for someone who thrives at the intersection of distributed systems, streaming data, and AI at enterprise scale. You will architect and build production-grade AI pipelines, embed modern observability into LLM and VLM (Vision Language Model) systems, and engineer for reliability, performance, and cost transparency from day one.

In this role, you will design scalable ingestion and streaming orchestration for high-volume clickstream and case-lifecycle data, deploy and operate containerized services through CI/CD and infrastructure-as-code, and bring rigor to AI systems through comprehensive monitoring, tracing, and resiliency patterns. You’ll also work with video and transcript data, enabling downstream analysis and integrating with evaluation and analytics workstreams. The result: AI platforms that are not only innovative, but trusted, explainable, and built to run the business.

Requirements

5+ years of Specialty Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education.
2+ years of experience with streaming platforms for data orchestration.
2+ years of experience with containerization and container orchestration platforms.
2+ years of deploying and operating ML/AI models in production environments.
3+ years of experience programming with Python.

Nice To Haves

5+ years of experience in software engineering with focus on data pipelines and distributed systems.
Experience with LLMOps/MLOps frameworks for observability.
Experience with Vision Language Models (VLMs) or multi-modal AI systems.
Experience with enterprise container platforms.
Background in video processing or computer vision pipelines.
Experience with observability tools and platforms.
Experience building resilient systems with patterns like circuit breakers, bulkheads, and retry policies.
Experience with cloud computing platforms.
Knowledge of infrastructure-as-code tools.
Experience with CI/CD pipelines.
Understanding of data quality and validation frameworks.
Experience in financial services or enterprise environments.
Excellent communication skills across technical and non-technical audiences.

Responsibilities

Pipeline Development: Build scalable, robust ingestion pipelines for processing clickstream data, handling both participant-level (full-day employee activity) and case-level (end-to-end case lifecycle) data streams.
Streaming Orchestration: Design and implement streaming platform-based orchestration for pipeline coordination, ensuring reliable data flow and processing guarantees.
Container Platform Deployment: Deploy and manage containerized services on enterprise container platforms, implementing CI/CD pipelines and infrastructure-as-code practices.
AI Model Observability: Implement comprehensive observability for LLM and VLM pipelines, including performance monitoring and metrics collection, distributed tracing for multi-model pipelines, logging and alerting for model inference, and cost tracking and optimization.
Resiliency Engineering: Build fault-tolerant systems with retry mechanisms, circuit breakers, dead letter queues, and graceful degradation patterns.
Video/Transcript Processing: Work with VLMs to process clickstream video data and generate high-quality transcripts for downstream analysis.
Integration: Ensure seamless integration with the downstream Analysis & Evaluation workstream.