Staff Software Engineer | Semantic Data Lake

WEX • US • San Jose, CA

9d • $140,600 - $173,100 • Remote

About The Position

WEX is transforming its enterprise data platform to convert raw data into meaningful, reusable, and trusted business assets. As a Staff Software Engineer on the Semantic Data Lake Team, you will be instrumental in designing, building, and maintaining core 360 data objects like Customer360, Fleet360, and Provider360. These entity-based tables are crucial for our analytics, AI, and product platforms.

You will implement transformation logic, encode business rules, and ensure data consistency across domains, making data models both technically scalable and business-ready. This role is central to WEX's DaaS platform, connecting raw data with business insights and defining the semantic backbone for products, analytics, and machine learning systems.

The ideal candidate is an AI-native engineer, proficient in using modern AI coding tools (Claude, Copilot, Cursor, etc.) and Spec-Driven Development (SDD) daily to accelerate design, generate and refactor transformation logic, write tests, document semantics, and explore data, all while applying engineering judgment for production-grade data assets. This is an opportunity to build semantic models that carry real-world meaning, scale significantly, and unify business understanding, leveraging modern AI tooling.

Requirements

8+ years of experience in data engineering or software engineering with a focus on data transformation, modeling, or analytics platforms.
Strong proficiency in SQL and at least one general-purpose language such as Python or Scala.
Demonstrated experience as an AI-native engineer—using tools like Claude, GitHub Copilot, Cursor, or similar as part of your everyday development workflow, with a clear point of view on where they accelerate your work and where human judgment is essential.
Comfort with modern AI engineering practices such as prompt design, context engineering, Spec-Driven Development (SDD), AI-assisted code review, and integrating LLMs or AI agents into engineering or data workflows.
Experience building and scaling wide, entity-based tables and modeling domain concepts (e.g., customer, fleet, provider) into durable data objects.
Solid understanding of data quality practices—including validation, enrichment, schema enforcement, and business rule encoding.
Experience working with large-scale datasets and optimizing transformation pipelines for performance and maintainability.
Comfort operating in a collaborative, cross-functional environment, balancing business logic with platform scalability.
A mindset for traceability, reproducibility, and semantic clarity—you build data models others (humans and AI systems alike) can trust and reuse.
Bachelor's degree in Computer Science, Software Engineering, or related field.

Nice To Haves

A Master's or PhD in Data Science, Machine Learning, Artificial Intelligence, Computer Science, or Statistics is a big plus.

Responsibilities

Design and implement semantically consistent, scalable 360 data models that integrate data across domains.
Build and maintain transformation pipelines that apply cleansing, standardization, enrichment, and derived logic to domain datasets.
Write production-quality, testable code in SQL and Python (or equivalent)—delivering performant and maintainable data assets.
Leverage AI coding assistants (Claude, Copilot, Cursor, and similar) to accelerate development—drafting transformation logic, generating tests, refactoring pipelines, exploring datasets, and producing semantic documentation—while critically reviewing AI output for correctness, performance, and alignment with business rules.
Develop and share patterns, prompts, and workflows that help the team get more leverage out of AI tooling, raising the bar for AI-native engineering practices across the Semantic Data Lake Team.
Work closely with domain experts, data scientists, and product stakeholders to translate business concepts into interpretable, decision-ready data models.
Implement logic for classifications, KPIs, scoring algorithms, and business rules, ensuring traceability and data lineage.
Help define and enforce standards for data modeling, documentation, and governance within the semantic layer—including standards for responsible, auditable use of AI-generated code and artifacts.
Collaborate across teams to integrate with ingestion, MDM, and data product layers, and explore opportunities to expose 360 objects to LLM-powered and agentic applications.