Senior Data Engineer

Veracyte
San Diego, CA
$160,000 - $187,000
Hybrid

About The Position

The Senior Data Engineer will contribute to Veracyte’s success by designing, developing, and maintaining scalable cloud data infrastructure and pipelines to support the company’s data engineering needs. This role involves hands-on work with data lakes, meshes, and catalogs, collaborating with cross-functional teams in a Scrum environment to deliver high-quality data solutions. The Senior Data Engineer will support the implementation of data management frameworks and align with Veracyte’s global data strategy, policies, and digital transformation initiatives, including the Veracyte Lakehouse built on AWS and Snowflake.

Requirements

  • Bachelor’s or Master’s degree in Engineering, Computer Science, or a related field.
  • 5+ years of experience with a Bachelor’s degree, or 3+ years with a Master’s degree, in data engineering or a similar role.
  • Hands-on experience designing and deploying data pipelines in cloud environments, preferably AWS and/or GCP.
  • Proficiency in programming languages such as Python, Java, or Scala.
  • Experience with AWS services (S3, Glue, Lake Formation, SageMaker) and Snowflake for data warehousing, ELT processes, and data modeling.
  • Familiarity with data cataloging tools, data lakes, and governance best practices.
  • Strong problem-solving and analytical skills.
  • Excellent communication and collaboration abilities to work effectively in a cross-functional Scrum team.
  • Ability to thrive in a fast-paced, dynamic environment focused on data as a service (DaaS).

Nice To Haves

  • Knowledge of open formats such as Apache Parquet and Apache Iceberg.

Responsibilities

  • Design and Develop Data Infrastructure: Build and maintain scalable, efficient data pipelines and infrastructure for Lakehouse systems, including bronze, silver, and gold data layers. Work with technologies such as Amazon S3, Snowflake, AWS Glue, Lake Formation, and SageMaker for data storage, processing, and analytics.
  • Collaborate Across Teams: Partner with the Technical Program Manager (TPM), data scientists, and stakeholders to understand business requirements and translate them into technical data solutions. Participate in Scrum processes, including backlog grooming, sprint planning, and handling data set requests via Jira.
  • Optimize and Secure Data: Optimize data retrieval, processing, and ELT workflows for improved performance and reliability. Implement data security measures, governance policies, and compliance with PHI, consent, and regulatory requirements.
  • Support Data Management Initiatives: Assist in identifying and assessing internal and external data sources for the data catalog. Contribute to the evaluation, development, or integration of user-friendly data catalog applications aligned with best practices. Help provide training and support to users of the data catalog.
  • Contribute to Data Strategy: Provide technical input to support the development and implementation of Veracyte’s data strategy and policies. Collaborate on defining user stories, data quality levels (e.g., Medallion architecture), and access controls for datasets. Support data acquisition, curation, and delivery for use cases like AI model training, clinical decision support, and operational efficiency.
  • Mentorship and Knowledge Sharing: Mentor junior data engineers and foster a culture of continuous learning. Share expertise in data engineering best practices, emerging technologies, and tools like Apache Parquet, Iceberg, and Zero-ETL integrations.