Lead Software and Data Engineer

William Blair · Chicago, IL
Hybrid

About The Position

Solutions for Today’s Challenges. Vision for Tomorrow’s Opportunities. Join William Blair, the Premier Global Partnership.

The Investment Banking division has built a differentiated AI foundation: proprietary ML models integrated into CRM workflows, an Azure-based analytics stack, and generative AI solutions deployed to 650+ bankers globally. We are now scaling our Investment Banking AI & Technology team to accelerate the integration of next-generation AI capabilities, anchored by frontier LLMs as the central reasoning engine and augmented by best-in-class point solutions for research and deal execution, into every stage of the banking workflow.

We are looking for a Lead Software & Data Engineer to join this team. You will own the reliability and evolution of our microservices architecture, data pipelines, and data models, ensuring the systems that underpin investment banking workflows are robust, scalable, and fit for purpose.

Data engineering and data modeling are central to the role. You will design and maintain the pipelines, schemas, and orchestration patterns that move and shape data across the platform, working closely with stakeholders to ensure data is accurate, accessible, and structured to support downstream analytics and reporting. You will bring rigorous engineering standards to everything you ship: clean code, tested pipelines, and systems built for longevity, not just the immediate need.

Requirements

  • Bachelor’s degree in Computer Science, Software Engineering, or a related field.
  • 5+ years of software and data engineering experience with a strong foundation in production systems that serve demanding end users.
  • Strong Python proficiency for data engineering: writing clean, production-grade pipeline code, scripting, and package management—Python is the primary development language for this role.
  • Hands-on experience with Dagster for pipeline orchestration: asset-based pipelines, sensors, schedules, and ideally Dagster+ Cloud deployments (a brief sketch follows this list).
  • Hands-on Azure Synapse experience: Synapse Pipelines, Notebooks, Linked Services, and Integration Runtimes—migration experience strongly preferred.
  • Solid general data engineering fundamentals: data modeling, ETL/ELT design, pipeline orchestration, data lake architecture (ADLS Gen2), and SQL.
  • Working knowledge of PySpark—DataFrames, Spark SQL—sufficient to support and maintain existing distributed workloads.
  • Experience deploying and managing containerized services on Kubernetes, including API microservice development patterns.
  • Comfort working with AI-assisted development tools (e.g., Claude Code, GitHub Copilot) and a track record of using them to ship higher-quality work faster.
  • Rigorous engineering practices: you write tested, reviewed, well-documented code and build systems designed for maintainability, not just demos.
  • Familiarity with capital markets, ideally with direct experience in or adjacent to investment banking, private equity, venture capital, or hedge funds.
  • Experience with Azure cloud infrastructure and the Databricks/Spark platform.
  • Outcome orientation—you measure success by business impact delivered, not features shipped.
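
As a frame of reference for the Dagster requirement above, here is a minimal sketch of the asset-and-schedule pattern (sensors and Dagster+ deployment configuration omitted for brevity). The asset names and the pandas payload are illustrative assumptions, not details of William Blair's actual platform:

    import pandas as pd
    from dagster import AssetSelection, Definitions, ScheduleDefinition, asset, define_asset_job

    @asset
    def raw_deals() -> pd.DataFrame:
        # Source extract; a real pipeline would read from a system of record.
        return pd.DataFrame({"deal_id": [1, 2], "stage": ["pitch", "mandate"]})

    @asset
    def active_deals(raw_deals: pd.DataFrame) -> pd.DataFrame:
        # Dagster infers the dependency on raw_deals from the parameter name.
        return raw_deals[raw_deals["stage"] != "closed"]

    # Materialize every asset nightly at 02:00 via a cron schedule.
    refresh_job = define_asset_job("refresh_deals", selection=AssetSelection.all())
    nightly = ScheduleDefinition(job=refresh_job, cron_schedule="0 2 * * *")

    defs = Definitions(assets=[raw_deals, active_deals], jobs=[refresh_job], schedules=[nightly])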

Nice To Haves

  • Prior work with Salesforce APIs, SOQL, or CRM integration patterns (see the sketch after this list).
  • Contributions to engineering culture: mentoring, establishing best practices, or leading technical design reviews in a small-team environment.
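
For the CRM integration point above, a hedged sketch of a SOQL pull using the open-source simple_salesforce client; the credentials and the choice of Opportunity fields are placeholder assumptions:

    from simple_salesforce import Salesforce

    # Placeholder credentials; production access would come from a managed secret store.
    sf = Salesforce(
        username="user@example.com",
        password="...",
        security_token="...",
    )

    # SOQL query against the standard Opportunity object for open deals.
    result = sf.query(
        "SELECT Id, Name, StageName, Amount "
        "FROM Opportunity WHERE IsClosed = false"
    )
    for record in result["records"]:
        print(record["Name"], record["StageName"])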

Responsibilities

  • Design, build, and maintain production-grade data pipelines in Python—writing clean, modular, well-tested pipeline code that can be owned and extended by the broader team.
  • Lead the migration of existing data processes from Azure Synapse to Dagster, including Synapse Pipelines, Notebooks, Linked Services, and Integration Runtimes, ensuring continuity of data flows throughout the transition.
  • Own and evolve Dagster-based orchestration: asset-based pipelines, sensors, schedules, and Dagster+ Cloud deployments as the primary orchestration standard going forward.
  • Architect and maintain a scalable data lake on ADLS Gen2, including data modeling, ETL/ELT design, and schema governance appropriate for confidential deal information.
  • Support PySpark workloads where required—DataFrames, Spark SQL, and performance tuning—with the majority of pipeline development delivered in Python (a PySpark sketch follows this list).
  • Develop Kubernetes-based microservices for data and API workloads, ensuring reliable, scalable deployment of pipeline and application components (a minimal service sketch also follows this list).
  • Set engineering standards for the Investment Banking AI & Technology team: code review practices, CI/CD pipelines, testing frameworks, and documentation norms that enable speed without sacrificing reliability.
  • Leverage AI-assisted development tools—including agentic coding environments such as Claude Code—to accelerate prototyping, code generation, and pipeline development; champion their effective adoption across the team.
  • Work directly with deal teams and industry/sector groups to understand workflows, identify automation opportunities, and iterate on deployed tools based on real-world banker feedback.
  • Perform rapid analysis and prototyping—translate a banker's pain point into a working proof of concept within days, not weeks.
  • Implement security and data governance protocols appropriate for confidential deal information.
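
To ground the PySpark responsibility above, a small sketch of the DataFrame API and Spark SQL over the same data; the deal records and column names are invented for illustration:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("deal-metrics").getOrCreate()

    deals = spark.createDataFrame(
        [("D-1", "healthcare", 120.0), ("D-2", "technology", 250.0)],
        ["deal_id", "sector", "fee_mm"],
    )

    # DataFrame API: total fees by sector.
    by_sector = deals.groupBy("sector").agg(F.sum("fee_mm").alias("total_fees"))

    # The same aggregation expressed in Spark SQL via a temporary view.
    deals.createOrReplaceTempView("deals")
    by_sql = spark.sql("SELECT sector, SUM(fee_mm) AS total_fees FROM deals GROUP BY sector")

    by_sector.show()
    by_sql.show()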
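And for the Kubernetes microservice responsibility, a minimal sketch of an API service exposing the health endpoint a liveness/readiness probe would hit. FastAPI is one common choice rather than a stated requirement, and the route names are assumptions:

    from fastapi import FastAPI

    app = FastAPI(title="pipeline-status-service")

    @app.get("/healthz")
    def healthz() -> dict:
        # Target for Kubernetes liveness/readiness probes; returns 200 when ready.
        return {"status": "ok"}

    @app.get("/pipelines/{name}/status")
    def pipeline_status(name: str) -> dict:
        # Placeholder lookup; a real service would query the orchestrator's API.
        return {"pipeline": name, "state": "running"}

Locally this runs with uvicorn (uvicorn app:app --port 8080); in the cluster, the same /healthz route is wired into the Deployment's probe configuration.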