Thermo Fisher Scientific - Waltham, MA

posted about 1 month ago

Full-time - Mid Level
Remote - Waltham, MA
Computer and Electronic Product Manufacturing

About the position

The Scientist III, Data Sciences role at Thermo Fisher Scientific involves developing scalable data pipelines and API integrations to keep pace with growing data volume and complexity. The position owns and delivers projects related to data platform solutions, using a range of data engineering technologies and cloud platforms, and emphasizes agile methodologies and operational excellence while ensuring data security and performance.

Responsibilities

  • Develop scalable data pipelines and build new API integrations to support increasing data volume and complexity.
  • Own and deliver projects/enhancements associated with data platform solutions.
  • Develop solutions using PySpark/EMR, SQL, and AWS services including Athena, S3, Redshift, API Gateway, Lambda, and Glue (a minimal sketch of this stack follows this list).
  • Implement solutions using AWS together with supporting tooling, including GitHub, Jenkins, Terraform, Jira, and Confluence.
  • Follow agile development methodologies to deliver solutions and product features, adhering to DevOps, DataOps, and DevSecOps practices.
  • Propose data load optimizations and continuously implement improvements to data load performance.
  • Identify, design, and implement internal process improvements, including automating manual processes and optimizing data delivery.
  • Ensure data security across multiple data centers and AWS regions.
  • Participate in on-call schedule to address critical operational incidents and business requests.
  • Meet or exceed BI operational SLAs across key metrics.
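
The services named in these responsibilities typically compose into a simple curation pipeline: PySpark (on EMR) reads raw data from S3, shapes it, and writes partitioned Parquet that the Glue catalog and Athena can query. The sketch below is illustrative only; the bucket paths, column names, and app name are hypothetical and not taken from the posting.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical locations; the posting names the services but no concrete paths.
RAW_PATH = "s3://example-raw-bucket/events/"
CURATED_PATH = "s3://example-curated-bucket/events/"

spark = SparkSession.builder.appName("events-curation").getOrCreate()

# Ingest raw JSON events landed in S3.
raw = spark.read.json(RAW_PATH)

# Normalize the timestamp, derive a partition column, and dedupe on a key.
curated = (
    raw.withColumn("event_ts", F.to_timestamp("event_time"))
       .withColumn("event_date", F.to_date("event_ts"))
       .dropDuplicates(["event_id"])
)

# Write partitioned Parquet so the Glue catalog and Athena can query it.
curated.write.mode("overwrite").partitionBy("event_date").parquet(CURATED_PATH)

spark.stop()
```

On EMR a job like this would be submitted via spark-submit; partitioning by date keeps Athena scans limited to the partitions a query actually touches.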

Requirements

  • Master's degree in Computer Science, Mechanical Engineering, or a related field of study, plus 1 year of IT experience as a Data Engineer or in a related role.
  • Alternatively, a Bachelor's degree in Computer Science, Mechanical Engineering, or a related field of study, plus 3 years of IT experience as a Data Engineer or in a related role.
  • Knowledge of or experience with Informatica PowerCenter 10.4, Oracle R12, TOAD, SFDC, data warehouse administration, UNIX, WinSCP, Windows 7, Linux, Informatica PowerExchange, Oracle 11g, and flat files.
  • Proficiency in SQL, PL/SQL, NoSQL, and big data technologies.
  • Experience with Databricks, data lake/Delta Lake platforms, and relational databases such as Oracle or AWS Redshift.
  • Experience in Databricks/Spark-based data engineering pipeline development.
  • Proficiency in Python-based data integration and pipeline development.
  • Experience with data lake and Delta Lake architectures using AWS Glue and Athena.
  • Knowledge of AWS cloud data integration with the Apache Spark, Glue, Kafka, Elasticsearch, Lambda, S3, Redshift, RDS, and MongoDB/DynamoDB ecosystems.
  • Proficiency in Python development, including PySpark in an AWS cloud environment (see the Delta Lake sketch after this list).
  • Analytical experience with databases, including writing complex queries, query optimization, debugging, user-defined functions, views, and indexes.
  • Familiarity with source control systems such as Git, and with build and continuous integration tools such as Jenkins.
  • Understanding of development methodologies and writing functional and technical design specifications.
  • Ability to resolve complex data integration problems.
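
The Delta Lake experience listed above typically centers on incremental upserts into a lake table. The sketch below is a minimal illustration, not a description of Thermo Fisher's actual pipelines; it assumes the delta-spark package is installed, and the paths, table, and key column are hypothetical.

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

# Hypothetical locations and key column; none are specified in the posting.
TARGET_PATH = "s3://example-lake/customers/"
STAGING_PATH = "s3://example-staging/customer_updates/"

# Delta requires these two settings when running outside Databricks.
spark = (
    SparkSession.builder.appName("delta-upsert")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

updates = spark.read.parquet(STAGING_PATH)

# MERGE the staged changes into the Delta table, keyed on customer_id:
# matched rows are updated in place, new rows are inserted.
target = DeltaTable.forPath(spark, TARGET_PATH)
(
    target.alias("t")
    .merge(updates.alias("u"), "t.customer_id = u.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)

spark.stop()
```

Because MERGE is transactional in Delta Lake, a pattern like this gives idempotent incremental loads, which is one common answer to the query-optimization and data-load-performance duties described earlier.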