Lead Data Engineer

RE Partners
$100,000 - $250,000

About The Position

What You’ll Do

  • Design and build Spark ETL pipelines on the AWS data platform.
  • Collaborate with cross-functional teams such as data science, fraud, marketing, and other business stakeholders to understand their data needs and deliver reliable solutions.
  • Optimize data infrastructure: design and maintain robust data infrastructure using modern data platform architecture.
  • Ensure data quality and reliability.
  • Innovate and follow best practices.
  • Ensure operational excellence of the data platform, including monitoring, incident response, performance optimization, and continuous improvement.

Who We’re Looking For (“Must Haves”)

  • Professional experience in data warehousing, data architecture, and/or data engineering environments, especially with Spark, Hadoop, Hive, etc., and a solid understanding of streaming pipelines.
  • Proficiency in at least one high-level programming language (Scala, Java, Python, or equivalent).
  • Good understanding of databases.
  • Experience building large-scale data products and an understanding of the tradeoffs made when building such features.
  • Deep understanding of system design, data structures, and algorithms.
  • Excellent knowledge of distributed computing frameworks such as Hadoop MapReduce and Spark.
  • Strong knowledge of AWS infrastructure: EMR, S3, Lambda, Redshift, etc.
  • Strong understanding of data quality and governance.
  • A self-driven, highly motivated team player who loves to learn new things.
