About The Position

The Human Genomics and Translational Data Sciences team within Cardiometabolic Research Data Science is hiring a Bioinformatics Pipeline Engineer to help build, solidify, and scale the analytical pipelines our scientists rely on every day. Our work spans multiple omics workflows, including target discovery and target due diligence, single-cell sequencing, genomics, proteomics, and, increasingly, AI-assisted workflows that pull these analyses together into faster, more reproducible products for therapeutic area partners across Lilly Research Labs.

This role sits at the intersection of two worlds. On one side, we employ classical bioinformatics and statistical genetics pipelines: robust, reproducible, well-tested workflows that turn messy public and proprietary genomics data into trustworthy answers. On the other, we work with the rapidly evolving stack of AI tooling: large language models like Claude, agentic workflows, AI-friendly connectors built on MCP (Model Context Protocol), and the code that lets scientists query complex datasets in natural language. We want someone who is genuinely curious about both, and keen to use both to increase the value we derive from our datasets for target support and novel target discovery.

You will not be expected to be a senior expert in either domain on day one. You will be expected to bring strong software engineering instincts, along with the curiosity and creativity to get more out of the tools and datasets at our disposal. You will work closely with statistical geneticists, computational biologists, and other engineers, both within our team and across Lilly, to ship tools that make the science faster and more reliable.

Requirements

  • B.S. in computer science, computational biology, bioinformatics, biological sciences, statistics, or a related field, with 10+ years relevant work experience
  • OR M.S. in computer science, computational biology, bioinformatics, biological sciences, statistics, or a related field, with 7+ years relevant work experience
  • OR Ph.D. in computer science, computational biology, bioinformatics, biological sciences, statistics, or a related field, with 1+ years relevant work experience
  • Strong programming skills in Python and/or R, including comfort with version control (Git), code review, testing, and writing maintainable code
  • Demonstrated ability to build stable, practical, reusable workflows rather than one-off analysis code, with strong implementation skills in Python and modern AI/ML tooling
  • A collaborative, low-ego mentality; you enjoy building tools that other people use and you take feedback well
  • Comfort with cloud computing environments (AWS, GCP, or Azure) and Linux/command-line work
  • Ability to work successfully in a matrixed environment

Nice To Haves

  • Demonstrated experience building data analysis pipelines, ideally using a workflow manager such as Nextflow, Snakemake, or WDL
  • Working familiarity with bioinformatics file formats (VCF, BED, GTF, BAM, etc.) and standard tools (PLINK, samtools, bcftools, or similar)
  • Familiarity with typical data types in high-throughput biology, including NGS data
  • Hands-on experience or strong demonstrated interest in modern AI tooling — using LLMs through APIs, building MCP servers/connectors, prompt engineering, or wiring up agentic workflows
  • Prior experience with statistical workflows/biomedical statistics
  • Prior exposure to statistical genetics methods (GWAS, fine-mapping, MR, colocalization, burden testing) or large-scale genomic datasets (UK Biobank, gnomAD, GTEx, Open Targets)
  • Prior experience with complex high-throughput biological data or experiments such as spatial transcriptomics, large-scale screens, or multi-omics studies
  • Familiarity with R in addition to Python, particularly for statistical genetics packages
  • Experience with relational and/or graph databases, and with biomedical ontologies
  • Contributions to open-source projects or a public portfolio (GitHub, blog posts, demos)
  • Prior experience in pharma, biotech, or academic genomics research

Responsibilities

  • Support computational biology workflows, including single-cell, spatial, and other multi-omics analysis workflows for clinical and preclinical applications
  • Use modern workflow managers (e.g. Nextflow, Snakemake, or similar) and containerization (Docker, Singularity) to make pipelines portable, testable, and reusable across projects and teams
  • Help build and maintain reproducible analytical pipelines for statistical genetics and bioinformatics workflows
  • Wrap and harden ad-hoc analytical scripts written by scientists into production-quality tools that can be re-run reliably by others
  • Write tests, documentation, and clear examples so the pipelines you build are usable by colleagues with a range of technical backgrounds
  • Prototype agentic workflows that automate established and routine analytical tasks — for example, pulling target evidence across data sources, generating standardized due-diligence reports, or letting scientists interrogate complex datasets in natural language
  • Build and maintain MCP connectors that expose internal data, public resources, and analytical pipelines to LLM-based agents and tools like Claude
  • Identify and develop use cases where LLMs and agentic AI workflows can improve the speed, quality, consistency, or accessibility of work across therapeutic areas, focusing on end-to-end capabilities rather than isolated task completion
  • Contribute to a shared library of reusable AI tooling, prompt patterns, and integration code that the team can build on. Define technical standards for evaluation, documentation, guardrails, and workflow quality so that AI-based solutions are trusted, reproducible, and suitable for repeated use across teams and projects
  • Stay current with the AI tooling landscape and bring back ideas the team can put to work; help improve AI fluency among collaborators by demonstrating practical workflows
  • Partner closely with statistical geneticists, computational biologists, and software engineers within the Cardiometabolic Data Science group and across other Lilly Research Labs teams
  • Work with therapeutic area partners to understand their analytical needs and translate them into pipeline requirements
  • Coordinate with platform and engineering groups to ensure your pipelines integrate cleanly with broader Lilly infrastructure
  • Contribute to internal knowledge sharing — code reviews, demos, documentation, and helping colleagues get unblocked
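To give a flavor of the "wrap and harden ad-hoc scripts" responsibility above, here is a minimal, purely illustrative sketch of turning a hypothetical one-off filtering snippet into a reusable, testable command-line tool. All names and thresholds are invented for illustration and do not describe any actual Lilly pipeline.

```python
# Illustrative only: a hypothetical one-off VCF QUAL filter, restructured
# into a pure function (easy to unit-test) plus a thin argparse CLI wrapper.
import argparse
import sys


def filter_variants(lines, min_qual=30.0):
    """Yield VCF lines whose QUAL field (column 6) meets the threshold.

    Header lines (starting with '#') are always passed through unchanged.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        try:
            qual = float(fields[5])
        except (IndexError, ValueError):
            continue  # skip malformed records instead of crashing mid-run
        if qual >= min_qual:
            yield line


def main(argv=None):
    parser = argparse.ArgumentParser(description="Filter VCF records by QUAL.")
    parser.add_argument("vcf", type=argparse.FileType("r"), help="input VCF")
    parser.add_argument("--min-qual", type=float, default=30.0)
    args = parser.parse_args(argv)
    for line in filter_variants(args.vcf, args.min_qual):
        sys.stdout.write(line)


if __name__ == "__main__":
    main()
```

Separating the filtering logic from the I/O and argument parsing is what makes a script like this re-runnable and reviewable by others: the core function can be unit-tested and imported into a workflow manager task, while the CLI stays a thin shell.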

Benefits

  • company bonus (depending, in part, on company and individual performance)
  • company-sponsored 401(k)
  • pension
  • vacation benefits
  • medical, dental, vision and prescription drug benefits
  • flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts)
  • life insurance and death benefits
  • certain time off and leave of absence benefits
  • well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities)