Senior Data Engineer - EMEA
Chainalysis · Posted: May 2, 2023 · Remote
About the position
The Senior Data Engineer, Investments, at Chainalysis will be responsible for building and maintaining data pipelines for the Market Intelligence product. This includes developing an infrastructure of data-intensive pipelines that run with low latency, choosing the technology that powers them, and collaborating closely with data scientists. The successful candidate will have 5+ years of experience in data engineering, strong knowledge of data modeling and architecture, and proficiency in programming languages such as Python and SQL. They will also have experience with Agile and working in a collaborative environment with cross-functional teams.
Responsibilities
- Build and maintain data pipelines for the Market Intelligence product
- Develop an infrastructure of data-intensive pipelines that run with low latency
- Choose the technology that powers the pipelines and collaborate closely with data scientists
- Write and maintain ETLs and their orchestration to produce meaningful and timely insights for customers
- Lead projects as a senior engineer and help customers understand the market they are in
- Work with other engineering teams to understand their data lifecycle and develop the new iteration of the data engineering stack and data infrastructure
- Develop and operate scalable data pipelines and build out new integrations with internal and external data sources
- Maintain optimal data pipeline architecture and propose improvements to the existing architecture
- Create scalable, self-healing, and robust data pipelines with low latency
- Implement observability and monitoring tools to ensure pipeline health, data quality, and timeliness of data
- Have strong knowledge of data modeling, data architecture, and data governance
- Be proficient in programming languages such as Python and SQL
- Be familiar with big data storage technologies such as Hadoop Distributed File System (HDFS), Amazon S3, or Azure Blob Storage, table formats like Iceberg and Delta, and file formats like Parquet and Avro
- Have a strong understanding of data security and privacy issues and experience implementing data security measures
- Work with Agile and cross-functional teams, collaborating with data analysts, data scientists, and other stakeholders to understand their data needs and design pipelines to meet those needs
- Have experience with DevOps methodologies, taking ownership of the CI/CD pipelines and using tools such as GitHub Actions, CircleCI, etc.
- Have excellent communication and presentation skills for working with both technical and non-technical stakeholders
- Mentor junior data engineers and participate in knowledge-sharing sessions with other teams
- Be proactive and try out new solutions, asking for forgiveness, not permission
- Be curious about cryptocurrencies/decentralized finance, or have a desire to learn
Requirements
- 5+ years of experience in data engineering, with a focus on designing and implementing data pipelines using orchestration tools such as Airflow, Dagster, or Prefect.
- Strong experience with big data processing tools such as Databricks, Dremio, Fivetran, Snowflake, dbt, EMR, Athena, Glue, and Presto.
- Strong experience with cloud service providers such as AWS, GCP, or Azure, and with infrastructure management using Terraform or alternatives such as AWS CloudFormation.
- Experience with implementing observability and monitoring tools such as Humio and Datadog to ensure pipeline health, data quality, and timeliness of data.
- Strong knowledge of data modeling, data architecture, and data governance, including approaches such as Data Mesh, Data Vault, star schemas, and the Kimball and Inmon methodologies.
- Proficiency in programming languages such as Python and SQL.
- Familiarity with big data storage technologies such as Hadoop Distributed File System (HDFS), Amazon S3, or Azure Blob Storage, table formats like Iceberg and Delta, and file formats like Parquet and Avro.
- Strong understanding of data security and privacy issues and experience implementing data security measures.
- Experience with Agile and working in a collaborative environment with cross-functional teams, collaborating with data analysts, data scientists, and other stakeholders to understand their data needs and design pipelines that meet them.
- Experience with DevOps methodologies, taking ownership of the CI/CD pipelines and using tools such as GitHub Actions, CircleCI, etc.
- Excellent communication and presentation skills for engaging both technical and non-technical stakeholders.
- Experience mentoring junior data engineers and participating in knowledge-sharing sessions with other teams.
- Eagerness to be proactive and try out new solutions, asking for forgiveness rather than permission.
- Curiosity about cryptocurrencies/decentralized finance, or a desire to learn - we can help!