Senior Data Engineer (GenAI & LLM Infrastructure)

Proscia · Philadelphia, PA
6d · Hybrid

About The Position

As a Senior Engineer, you will contribute to Proscia’s growing Real World Data (RWD) business, which operates as a “startup within a startup.” In this entrepreneurial environment, you will help build and scale data + AI systems that drive better outcomes for cancer patients and support cutting-edge research into therapies and drug regimens. Success in this role requires independence, adaptability, and close collaboration with cross-functional teams—bringing a “build the plane while flying it” mindset while staying aligned with broader engineering initiatives.

Requirements

  • Strong experience building production systems in Python.
  • Demonstrated experience delivering GenAI/LLM solutions into production (beyond experimentation), such as structured extraction pipelines, retrieval/embedding-based systems, or LLM-powered analytics workflows.
  • Experience owning the LLM lifecycle in production: prompt/model versioning, evaluation/regression testing, monitoring, and controlled releases.
  • Experience shipping an LLM-enabled workflow end-to-end (design → build → deploy → operate).
  • Experience building systems where outputs are testable, traceable, and reproducible (evidence references, versioning, run logs).
  • A pragmatic approach to reliability—handling ambiguity, conflicts, and change without breaking downstream analytics.
  • Solid fundamentals in SQL, data modeling, and data warehouse patterns; experience with Snowflake or similar platforms.
  • Software engineering practices: unit/integration testing, CI/CD, and containerization (Docker; Kubernetes).
  • Experience with cloud platforms (AWS preferred).
  • Comfort selecting and integrating the right tools to build, evaluate, deploy, and operate LLM workflows in production (we’re tooling-agnostic and prioritize end-to-end delivery over specific frameworks).
  • The ability to work independently, move quickly, and collaborate effectively across teams.

Nice To Haves

  • Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or related field (Master’s preferred).
  • Experience in life sciences/biopharma (helpful, but not required).

Responsibilities

  • Build and deploy LLM-enabled data products/workflows that turn structured + unstructured inputs into curated, research-ready outputs.
  • Develop and refine data pipelines and warehouse layers (raw → curated → marts) to support both analytics and AI workflows.
  • Implement LLMOps/MLOps foundations: evaluation, versioning, monitoring/observability, and safe release processes for model/prompt changes.
  • Deliver traceable and reproducible outputs (evidence references, run metadata, input/version tracking) so results can be explained and debugged.
  • Identify and implement process improvements—automation, reliability controls, and quality checks—to accelerate delivery and reduce manual effort.
  • Collaborate with core engineering, AI, and RWD stakeholders to align technical strategy and integrate solutions into the broader Proscia platform.

Benefits

  • Along with competitive pay, we provide comprehensive benefits, flexible schedules, and insurance options to promote long-term health and personal growth.