Scientist III, Data Engineer

Thermo Fisher Scientific | Waltham, MA
$161,803 - $186,000 | Remote

About The Position

COMPANY: Thermo Fisher Scientific Inc.
LOCATION: 168 Third Ave., Waltham, MA 02451
TITLE: Scientist III, Data Engineer
HOURS: Monday to Friday, 8:00 am to 5:00 pm
TRAVEL: Up to 5% travel required (domestic and international). Remote work or telecommuting is permitted.

Duties and the required knowledge, education, and experience are listed in the Responsibilities and Requirements sections below.

Requirements

  • Master’s degree or foreign degree equivalent in Technology Management, Information Technology, Computer Science, or a related field of study.
  • 3 years of experience as a Data Developer, Data Engineer, or related occupation.
  • Alternatively, a Bachelor’s degree or foreign degree equivalent in Technology Management, Information Technology, Computer Science, or a related field of study, plus 5 years of experience as a Data Developer, Data Engineer, or related occupation.
  • Full life cycle implementation in AWS using PySpark/EMR, Athena, S3, Redshift, AWS API Gateway, Lambda, and Glue
  • Agile development methodologies following DevOps, DataOps, and DevSecOps practices
  • ETL Pipelines, GitHub, Jenkins, Terraform, Jira, Bitbucket, and Confluence
  • Informatica, Databricks, & AWS Glue
  • Data Lake using AWS Databricks, Apache Spark, & Python
  • Data visualization tools such as Power BI and Tableau
  • Data modeling and optimization for OLAP/OLTP systems with Star/Snowflake schemas
  • Strong knowledge of SQL, query optimization, and performance tuning in Redshift, Snowflake, or Oracle
  • Experience with CI/CD pipelines for data workflows using Jenkins, GitHub Actions, or AWS CodePipeline
  • Data governance, cataloging, and lineage using tools such as AWS Glue Data Catalog, Collibra, or Alation
  • Implementing data security, encryption, IAM policies, and regulatory compliance frameworks
  • Batch and real-time streaming pipelines using Kafka and Spark Streaming
  • Managing data governance, access control, and lineage using Databricks Unity Catalog for secure enterprise data sharing
  • Implementing Delta Lake architecture for ACID transactions, schema enforcement, and scalable data pipelines (see the sketch following this list)
  • Optimizing Delta Live Tables for automated ETL orchestration and reliable data delivery
  • Ensuring high availability and SLA-driven production support with proactive monitoring, incident management, and root cause analysis
  • Collaboration with cross-functional teams to translate scientific, laboratory, and business requirements into scalable data solutions.
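
A minimal sketch of the Delta Lake pattern named in the requirements above, assuming a Spark environment with the delta-spark package available (Databricks sets these configurations automatically); the S3 paths and the reading_id column are illustrative assumptions, not details from the posting.

```python
# A minimal sketch, assuming a Spark session with the delta-spark package on
# the classpath; paths and column names such as reading_id are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("delta-schema-enforcement-sketch")
    # Register Delta Lake with a stock Spark session; already set on Databricks.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Hypothetical raw JSON records landed in S3.
raw = spark.read.json("s3://example-bucket/raw/instrument_readings/")

curated = (
    raw.dropDuplicates(["reading_id"])
       .withColumn("ingested_at", F.current_timestamp())
)

# Each append is an atomic (ACID) commit; the write is rejected if the incoming
# schema does not match the target table's schema (schema enforcement).
(
    curated.write
    .format("delta")
    .mode("append")
    .save("s3://example-bucket/curated/instrument_readings/")
)
```

Because writes go through the Delta transaction log, each append is an atomic commit, and a mismatched schema is rejected rather than silently corrupting the table.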

Responsibilities

  • Develop scalable data pipelines and build out new API integrations to support continuing increases in data volume and complexity.
  • Own and deliver projects and enhancements associated with data platform solutions.
  • Develop solutions using PySpark/EMR, SQL and databases, AWS Athena, S3, Redshift, AWS API Gateway, Lambda, Glue, and other data engineering technologies (a pipeline sketch follows this list).
  • Write and edit complex queries as required to implement ETL and data solutions.
  • Implement solutions using AWS and other cloud platform tools, including GitHub, Jenkins, Terraform, Jira, and Confluence.
  • Follow agile development methodologies to deliver solutions and product features, applying DevOps, DataOps, and DevSecOps practices.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability.
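
A minimal sketch of the kind of scalable batch pipeline described in the responsibilities above, assuming PySpark on EMR or AWS Glue with read/write access to S3; the bucket name, columns, and partition key are hypothetical.

```python
# A minimal sketch, assuming PySpark on EMR or AWS Glue with access to S3;
# the bucket, column names, and partition key are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

# Extract: raw CSV files dropped into a landing prefix.
orders = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-bucket/landing/orders/")
)

# Transform: basic cleanup plus a derived partition column.
clean = (
    orders.filter(F.col("order_id").isNotNull())
          .withColumn("order_date", F.to_date("order_ts"))
          .withColumn("order_month", F.date_format("order_date", "yyyy-MM"))
)

# Load: partitioned Parquet that Athena or Redshift Spectrum can query once the
# prefix is registered in the Glue Data Catalog.
(
    clean.write
    .mode("overwrite")
    .partitionBy("order_month")
    .parquet("s3://example-bucket/curated/orders/")
)
```

The curated Parquet prefix can then be registered in the AWS Glue Data Catalog (for example with a crawler) so that Athena or Redshift Spectrum can query it directly.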

Benefits

  • A choice of national medical and dental plans, and a national vision plan, including health incentive programs
  • Employee assistance and family support programs, including commuter benefits and tuition reimbursement
  • At least 120 hours paid time off (PTO), 10 paid holidays annually, paid parental leave (3 weeks for bonding and 8 weeks for caregiver leave), accident and life insurance, and short- and long-term disability in accordance with company policy
  • Retirement and savings programs, such as our competitive 401(k) U.S. retirement savings plan
  • Employees’ Stock Purchase Plan (ESPP) offers eligible colleagues the opportunity to purchase company stock at a discount