Tech Lead – Data Engineer - Assets and Cash Flow

Vanguard, Malvern, PA (Hybrid)

About The Position

Vanguard is seeking a Tech Lead who is passionate about Vanguard and our funds. In this role, you will lead the system that reports assets and cash flow across each of Vanguard's lines of business, serving as a critical liaison between Finance and Technology teams and engaging with clients across the enterprise.

We're seeking a hands-on Tech Lead to own the design, build, and reliability of our modern ETL pipelines and lakehouse platform. This leader will partner with product, analytics, and platform engineering to deliver resilient, cost-effective data solutions that power high-quality insights. The ideal candidate is deeply experienced with AWS serverless data services (Lambda, Glue, and Step Functions) and has production expertise operating Apache Iceberg tables at scale: schema evolution, partitioning, compaction, snapshot management, and metadata performance. You will set engineering standards, mentor developers, and drive the roadmap for data ingestion, transformation, and consumption across batch and event-driven pipelines.

Requirements

  • 8+ years of related data engineering experience with a strong record of leading delivery for production ETL platforms.
  • 5+ years building on AWS with deep expertise in Lambda, Glue, and Step Functions (state machine patterns, error handling, retries, and parallel branches).
  • Strong proficiency in Python and SQL.
  • Hands‑on experience with batch and event‑driven pipelines (e.g., Kinesis/Kafka), S3‑based data lakes, and Glue/Athena/EMR integration.
  • Solid understanding of lakehouse architecture, columnar formats (Parquet), and performance tuning (file sizing, predicate pushdown, metadata operations).
  • Familiarity with data quality frameworks, lineage/metadata management, and governance (IAM, Lake Formation, encryption at rest/in transit).
  • Excellent communication skills; able to explain complex technical concepts to non‑technical audiences and influence cross‑functional decisions.
  • Undergraduate degree or equivalent combination of training and experience; graduate degree preferred.
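To make the state-machine patterns above concrete: Step Functions declares retries and backoff in its Amazon States Language rather than in application code, but the hypothetical Python sketch below mirrors those retry semantics (`IntervalSeconds`, `BackoffRate`, `MaxAttempts`) for a single task. All names here are illustrative, not part of any AWS SDK.

```python
import time

def run_with_retry(task, *, interval_seconds=2, backoff_rate=2.0,
                   max_attempts=3, sleep=time.sleep):
    """Model a Step Functions-style Retry policy for one task.

    The real service expresses this declaratively in the state
    machine definition; this sketch reproduces the same semantics
    in plain Python for illustration only.
    """
    delay = interval_seconds
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # retries exhausted: the execution fails
            sleep(delay)
            delay *= backoff_rate  # exponential backoff between attempts

# Example: a task that fails twice, then succeeds on the third attempt.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"

result = run_with_retry(flaky, sleep=lambda s: None)
```

In a real state machine, the equivalent policy would live in a `Retry` block on the task state, with `Catch` routing exhausted failures to a fallback branch.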

Nice To Haves

  • Experience operating Iceberg with query engines (Athena/Trino/Presto/Spark) and catalog integrations (Glue/HMS).
  • Experience in financial services or similarly regulated industries.

Responsibilities

  • Leads end‑to‑end architecture and delivery of data pipelines.
  • Defines patterns for ingestion, transformation, storage, and serving across batch/streaming workflows; selects the right tools and designs SLAs/SLOs for availability, latency, and cost.
  • Designs and implements serverless data solutions using AWS Lambda, Glue, and Step Functions.
  • Builds Glue Jobs (Spark/Python) and orchestrates state machines for complex dependency management, error handling, retries, and parallelization.
  • Owns Apache Iceberg table design and lifecycle.
  • Establishes standards for partitioning, clustering, snapshot retention, compaction, merge/delete, schema evolution, and Glue/AWS Lake Formation catalog management.
  • Ensures performance, reliability, and observability.
  • Implements robust monitoring, logging, and alerting (e.g., CloudWatch, metrics, tracing) and leads incident response, root‑cause analysis, and continuous improvement.
  • Elevates code into dev/test/prod safely and on schedule.
  • Champions CI/CD and IaC (CloudFormation/CDK/Terraform), automated testing (unit/integration/data quality), canary deployments, and rollback strategies.
  • Establishes data quality and governance controls.
  • Builds validation frameworks, contracts, and lineage; integrates with catalog, access controls (IAM/Lake Formation), encryption, and compliance requirements.
  • Mentors engineers and sets engineering standards.
  • Provides code reviews, technical coaching, and career growth guidance; fosters a culture of high‑quality design, documentation, and collaborative delivery.
  • Partners with stakeholders to translate business needs into scalable data products.
  • Works closely with analytics, product, and platform teams to prioritize backlogs, define SLAs, and deliver user‑centric datasets and APIs.
  • Optimizes cost and performance across the stack.
  • Tunes Glue (workers, DPUs, job bookmarks), Lambda (memory/timeout/concurrency), Iceberg (file sizing, compaction cadence), and storage/query engines (e.g., S3/Athena/EMR).
  • Participates in special projects and performs other duties as assigned.
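The Iceberg lifecycle duties above (file sizing, compaction cadence, snapshot retention) are normally carried out with Iceberg's own maintenance procedures such as `rewrite_data_files` and `expire_snapshots`. As a rough sketch of the idea behind small-file compaction only, the hypothetical planner below bin-packs undersized data files into rewrite batches targeting a given output size; every name and threshold here is an assumption for illustration.

```python
def plan_compaction(file_sizes_mb, target_mb=512, small_file_threshold_mb=128):
    """Group undersized files into batches of roughly target_mb each.

    Real Iceberg deployments delegate this to rewrite_data_files;
    this toy planner only illustrates the bin-packing logic that
    motivates small-file compaction.
    """
    # Only files below the threshold are worth rewriting.
    small = sorted(s for s in file_sizes_mb if s < small_file_threshold_mb)
    batches, current, current_size = [], [], 0
    for size in small:
        if current and current_size + size > target_mb:
            batches.append(current)  # batch full: start a new one
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches

# Example: three small files get grouped; the 300 MB and 700 MB files
# already exceed the small-file threshold and are left alone.
batches = plan_compaction([40, 90, 300, 60, 700], target_mb=200)
```

In practice the cadence of such rewrites, alongside snapshot expiry, is what keeps Iceberg metadata operations and query planning fast.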