Data Engineer

Eli Lilly and Company, Indianapolis, IN

About The Position

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

What You Will Do

A Data Engineer is responsible for designing, developing, and maintaining the data solutions that ensure the availability and quality of data for analysis and/or business transactions. They design and implement efficient data storage, processing, and retrieval solutions for datasets; build data pipelines; optimize database designs; and work closely with data scientists, architects, and analysts to ensure data quality and accessibility. Data engineers need strong skills in data integration, acquisition, cleansing, harmonization, and transformation. They play a crucial role in turning raw data into analysis-ready datasets that enable organizations to unlock valuable insights for decision making. The responsibilities and qualifications for this role are detailed in the sections below.

How You Will Succeed

  • Deliver scalable solutions by designing robust data pipelines and architectures that meet performance and reliability standards.
  • Collaborate effectively with cross-functional teams to turn business needs into technical outcomes.
  • Lead with expertise, mentoring peers and driving adoption of best practices in data engineering and cloud technologies.
  • Continuously improve systems through automation, performance tuning, and proactive issue resolution.
  • Communicate with clarity to ensure alignment across technical and non-technical stakeholders.

Requirements

  • Bachelor’s degree
  • At least 2 years of experience in data engineering using core technologies such as SQL, Python, PySpark, and AWS services including Lambda, Glue, S3, Redshift, Athena, and IAM roles/policies.
  • 1+ years of experience working in Agile environments, with hands-on experience using GitHub and CI/CD pipelines for code deployment.
  • 1+ years of experience with orchestration tools like Airflow for workflow automation.
  • Proven experience in architecting and building high-performance, scalable data pipelines following Data Lakehouse, Data Warehouse, and Data Mart standards.
  • Strong expertise in data modeling (both OLTP and OLAP), managing large datasets, and implementing secure, compliant data governance practices.
  • Hands-on experience with Databricks, including cluster management, workspace configuration, notebook development, and performance optimization.
  • Qualified candidates must be legally authorized to be employed in the United States.
  • Lilly does not anticipate providing sponsorship for employment visa status (e.g., H-1B, OPT, TN status).
  • Strong proficiency in SQL and Python.
  • Hands-on experience with cloud platforms (AWS, Azure, or GCP) and tools like Glue, EMR, Redshift, Lambda, or Databricks.
  • Deep understanding of ETL/ELT workflows, data modeling, and data warehousing concepts.
  • Familiarity with big data and streaming frameworks (e.g., Apache Spark, Kafka, Flink).
  • Knowledge of data governance, security, and quality practices.
  • Working knowledge of Databricks for building and optimizing scalable data pipelines and analytics workflows.
  • Experience with CI/CD, version control (Git), and infrastructure-as-code tools is a plus.
  • A problem-solving mindset, attention to detail, and a passion for clean, maintainable code.
  • Strong communication and collaboration skills to work with both technical and non-technical stakeholders.
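The SQL and Python proficiency called for above can be illustrated with a minimal, self-contained sketch. All data, column names, and the SQLite backend here are hypothetical stand-ins for illustration only; a production pipeline at this scale would run on PySpark with Redshift or Databricks rather than in-memory SQLite:

```python
import sqlite3

# Hypothetical raw records, as they might arrive from an upstream extract.
raw_rows = [
    {"id": "1", "name": " Acme Corp ", "revenue": "1200.50"},
    {"id": "2", "name": "Beta LLC",    "revenue": ""},
    {"id": "1", "name": "Acme Corp",   "revenue": "1200.50"},  # duplicate id
]

def cleanse(rows):
    """Trim whitespace, coerce types, default missing revenue, drop duplicate ids."""
    seen, out = set(), []
    for r in rows:
        rid = int(r["id"])
        if rid in seen:
            continue  # keep first occurrence only
        seen.add(rid)
        out.append((rid, r["name"].strip(), float(r["revenue"] or 0.0)))
    return out

# Load the cleansed rows and query them back with plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, revenue REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", cleanse(raw_rows))
total = conn.execute("SELECT SUM(revenue) FROM accounts").fetchone()[0]
print(total)  # 1200.5
```

The same cleanse-then-load shape carries over to PySpark (deduplication, type coercion, null handling) when the dataset no longer fits on one machine.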

Nice To Haves

  • Domain experience in healthcare, pharmaceutical (Customer Master, Product Master, Alignment Master, Activity, Consent, etc.), or regulated industries is a plus.
  • Ability to partner with and influence vendor resources during solution development, ensuring shared understanding of the data, the technical direction, and delivery expectations.
  • AWS Certified Data Engineer
  • Databricks Certified Data Engineer (Associate)
  • Familiarity with AI/ML workflows and integrating machine learning models into data pipelines
  • Ability to collaborate with business stakeholders to translate key business requirements into scalable technical solutions.
  • Familiarity with security models and developing solutions on large-scale, distributed data systems.

Responsibilities

  • Design, build, and maintain scalable and reliable data pipelines for batch and real-time processing.
  • Own incident response and resolution, including root cause analysis and post-mortem reporting for data failures and performance issues.
  • Develop and optimize data models, ETL/ELT workflows, and data integration across multiple systems and platforms.
  • Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver solutions.
  • Implement data governance, security, and quality standards across data assets.
  • Lead end-to-end data engineering projects and contribute to architectural decisions.
  • Design and implement cloud-native solutions on AWS (preferred) using tools such as AWS Glue, EMR, and Databricks. Experience with Azure or GCP is a plus.
  • Promote best practices in coding, testing, and deployment.
  • Monitor, troubleshoot, and improve performance and reliability of data infrastructure.
  • Automate manual processes and identify opportunities to optimize data workflows and reduce costs.
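The pipeline, quality-standard, and incident-response duties above can be sketched in a deliberately tool-agnostic way. Every function and field name below is invented for illustration; in practice these steps would be Airflow tasks or Glue jobs, and the failure path would feed real alerting and post-mortem tooling:

```python
# Minimal sketch of a batch pipeline with a data-quality gate (illustrative only).

def extract():
    # Stand-in for reading from a source system.
    return [{"patient_id": 1, "consent": True}, {"patient_id": 2, "consent": None}]

def quality_gate(rows):
    """Fail fast when required fields are missing, so bad data never lands downstream."""
    bad = [r for r in rows if r["consent"] is None]
    if bad:
        raise ValueError(f"{len(bad)} row(s) missing consent flag")
    return rows

def transform(rows):
    return [{"patient_id": r["patient_id"], "consented": bool(r["consent"])} for r in rows]

def run_pipeline():
    try:
        loaded = transform(quality_gate(extract()))
        return ("success", loaded)
    except ValueError as err:
        # Incident hook: in production this would page on-call and open a post-mortem.
        return ("failed", str(err))

print(run_pipeline())  # ('failed', '1 row(s) missing consent flag')
```

The point of the gate-before-transform ordering is that a quality failure halts the run and surfaces an actionable error, rather than silently loading incomplete records.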

Benefits

  • Full-time equivalent employees will also be eligible for a company bonus (depending, in part, on company and individual performance).
  • In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company-sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities).