Data Engineer II

Hagerty
Remote

About The Position

As a Data Engineer II, you will join a fast-paced, high-functioning team to build and maintain the data pipelines and services that support Hagerty’s Enterprise Data Hub (EDH). Engineered and developed in-house, the EDH encompasses data processing and storage, services, and APIs. In this role you will develop data pipelines, services, and cloud-based infrastructure to support the growth of Hagerty’s Insurance business and automotive lifestyle brand. You will partner with a team of talented engineers working in an agile environment, leveraging modern cloud-based technologies to drive data-driven decision making in analytics and Hagerty’s data products. Ready to get in the driver’s seat? Join us!

Requirements

  • You have strong problem-solving abilities and attention to detail
  • You communicate authentically and effectively, both in writing and verbally, with various stakeholders
  • You create and share technical artifacts and documentation to support development and maintenance of data products
  • You have experience successfully delivering data products as production-ready software solutions
  • You ensure quality through rigorous code development, testing, automation, and other software engineering best practices
  • You have experience developing solutions using Python and cloud-based infrastructure (AWS, Azure, or GCP)
  • You have experience with imperative (e.g., Apache Airflow / NiFi) or declarative (e.g., Informatica/Talend/Pentaho) ETL design, implementation, and maintenance (see the sketch after this list)
  • You have functional knowledge of relational databases and query authoring (SQL)
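
For a rough illustration of the imperative ETL style called out above, here is a minimal sketch of an Airflow DAG in Python (assuming a recent Airflow 2.x release). Every name in it, from the DAG id to the tasks and sample data, is hypothetical and only demonstrates the extract-transform-load pattern, not Hagerty's actual pipelines:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract(**context):
        # Placeholder extract: a real pipeline would query a relational
        # source with SQL via an Airflow connection.
        return [{"policy_id": 1, "premium": 100.0}]


    def transform(**context):
        # Pull the extracted rows from XCom and apply a simple transformation.
        rows = context["ti"].xcom_pull(task_ids="extract")
        return [{**row, "premium_usd": round(row["premium"], 2)} for row in rows]


    def load(**context):
        # Placeholder load: a real pipeline would write to a warehouse table.
        rows = context["ti"].xcom_pull(task_ids="transform")
        print(f"Loading {len(rows)} rows")


    with DAG(
        dag_id="example_policy_pipeline",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        extract_task >> transform_task >> load_task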

Nice To Haves

  • Associate's degree, preferably in a technical/analytical field, or relevant work experience
  • An additional 3+ years in another role on an IT delivery team, such as developer, engineer, data analyst, quality assurance analyst, ETL developer, or DBA
  • Experience developing infrastructure as code in a cloud-based environment (Terraform preferred)
  • Experience cataloging and processing non-relational data
  • Experience with open-source data processing technologies such as streaming services (Kafka / SQS), big data processing frameworks (MapReduce/Spark), big data file stores (EMRFS / HDFS)
  • Experience evaluating different data formats based on workload needs (JSON, delimited files, Avro, Parquet); see the sketch after this list
  • Experience with container-based development
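
As a small, hypothetical example of weighing data formats against workload needs, the sketch below converts line-delimited JSON (row-oriented, easy to append) into Parquet (columnar, efficient to scan) using Python and pyarrow. The file and column names are placeholders:

    import json

    import pyarrow as pa
    import pyarrow.parquet as pq

    # Row-oriented input: one JSON record per line (file name is a placeholder).
    with open("events.jsonl") as f:
        records = [json.loads(line) for line in f]

    # Columnar output: Parquet stores each field contiguously, which makes
    # analytical scans over a few columns much cheaper than re-reading JSON.
    table = pa.Table.from_pylist(records)
    pq.write_table(table, "events.parquet")

    # Read back only the columns a downstream workload needs.
    subset = pq.read_table("events.parquet", columns=["event_type"])  # hypothetical column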

Responsibilities

  • Implement best practices around software development and big data engineering
  • Develop and implement robust and scalable data pipelines using Python, SQL, parallel processing frameworks, and AWS/Salesforce cloud solutions
  • Develop and implement batch data pipelines using tools such as Apache Airflow, Snowflake, and numerous AWS products (EC2, Fargate, ECS, Lambda, and RDS)
  • Develop streaming data integrations to support products across the Hagerty portfolio and enable real-time reporting (see the sketch after this list)
  • Develop Enterprise Data Hub platform infrastructure using Terraform infrastructure-as-code
  • Develop and support Hagerty’s cloud-based data warehouse to enable analytics and product reporting
  • Partner with internal and external stakeholders to collect requirements, recommend best practice solutions, and productionize new data ingestions/analytic workloads
  • Develop solutions to catalog and manage metadata to support data governance and data democratization
  • Partner with Data Quality Engineers to define and implement automated test cases and data reconciliation to validate ETL processes and data quality & integrity
  • Mentor junior team members in software and big data engineering best practices
  • Partner with Data Scientists to design, code, train, test, deploy and iterate machine learning algorithms and systems at scale
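
To make the streaming-integration responsibility more concrete, here is a minimal, hypothetical sketch of a Python consumer that long-polls an SQS queue and hands each event to a placeholder processing step. The queue URL and event handling are assumptions for illustration only, not Hagerty's actual integration:

    import json

    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-events"  # hypothetical


    def process(event: dict) -> None:
        # Placeholder: a real integration might transform the event and land it
        # in the warehouse or push it to a real-time reporting store.
        print(event)


    while True:
        # Long-poll for up to 10 messages at a time.
        response = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,
        )
        for message in response.get("Messages", []):
            process(json.loads(message["Body"]))
            # Delete only after successful processing so failures are retried.
            sqs.delete_message(
                QueueUrl=QUEUE_URL,
                ReceiptHandle=message["ReceiptHandle"],
            )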