Data Engineer - AI, Agents, & Context - Clinical (Sr. Associate)

Huron•Chicago, IL

Remote

About The Position

This role sits within a strategic investment to embed AI into how we operate, serve customers, and make decisions within our healthcare business. We're building a healthcare-wide AI data and context platform, with deep domain expertise embedded throughout our architecture. Our goals are:

Turn structured and unstructured information into trusted, reusable "building blocks" (semantic layers, retrieval services, and agent-ready interfaces) that accelerate product innovation.
Deliver transformational speed and leverage: faster time-to-insight, higher automation of knowledge work, and a foundation that scales AI safely and reliably as adoption grows.
Unlock new capabilities across our business and create the foundation that drives deeper domain innovation and cross-domain collaboration.

This is a hands-on technical contributor role that builds and maintains core AI/context data capabilities. The role executes key parts of the AI context platform (unstructured ingestion, embeddings, retrieval, and semantic layers), working closely with senior engineers and cross-functional partners to ship reliable, production-grade AI data products.

Requirements

3–6 years in data engineering or data platform roles with strong hands-on delivery
Strong SQL and Python (or Scala/Java); solid production engineering habits
Experience designing and operating cloud data pipelines at scale
Experience working with unstructured data processing and search/retrieval concepts
Clear communicator who can work effectively across technical and functional teams

Nice To Haves

Hands-on experience with vector search and embeddings (pgvector/Pinecone/Weaviate/OpenSearch/Elastic) and retrieval patterns (semantic retrieval, hybrid search, reranking)
Experience supporting LLM applications (RAG, agent tool interfaces, evaluation/observability)
Familiarity with knowledge graphs/semantic modeling or metrics layers
Experience in regulated environments and data governance programs

Responsibilities

Build and contribute to the AI context platform
Implement end-to-end pipelines: ingestion → parsing/chunking → enrichment → embeddings → vector indexing → retrieval/serving
Build and maintain patterns for incremental refresh, backfills, re-embeddings, deduplication, and lineage across unstructured sources
Contribute to retrieval quality improvements (query strategies, hybrid search, metadata filtering) in partnership with AI engineers
Deliver semantic and governed data products
Implement semantic layers (metrics/entities) that power BI and agent reasoning consistently
Apply established data contracts and context contracts for AI inputs (schemas, metadata requirements, freshness, citation expectations)
Ensure datasets and indexes are documented and reusable
Support reliability and performance across assigned workstreams: monitoring, alerting, runbooks, and incident response
Contribute to cost and latency optimization across warehouse/lakehouse and vector infrastructure
Apply security-by-design patterns: RBAC/ABAC, PII redaction, retention controls, and audit logging
Follow established guardrails for AI access to enterprise knowledge in coordination with Security/Legal/Compliance

Benefits

Medical, dental, and vision coverage
401(k) plan with a generous employer match
Employee stock purchase plan
Generous paid time off policy
Paid parental leave
Adoption assistance
Free annual health screenings and coaching
Bank at work
On-site workshops
Ongoing programs recognizing major events in the lives of our employees throughout the year