Data Engineer II

Hagerty
Remote

About The Position

As a Data Engineer II, you will join a fast-paced, high-functioning team to build and maintain the data pipelines and services that support Hagerty’s Enterprise Data Hub (EDH). Engineered and developed in-house, the EDH encompasses data processing and storage, services, and APIs. In this role you will develop data pipelines, services, and cloud-based infrastructure to support the growth of Hagerty’s Insurance business and automotive lifestyle brand. You will partner with a team of talented engineers working in an agile environment, leveraging modern cloud-based technologies to drive data-driven decision making in analytics and Hagerty’s data products. Ready to get in the driver’s seat? Join us!

Requirements

  • You have strong problem-solving abilities and attention to detail
  • You communicate authentically and effectively, both in writing and verbally, with various stakeholders
  • You create and share technical artifacts and documentation to support development and maintenance of data products
  • You have experience successfully delivering data products as production-ready software solutions
  • You ensure quality through rigorous code development, testing, automation, and other software engineering best practices
  • You have experience developing solutions using Python and cloud-based infrastructure (AWS, Azure, or GCP)
  • You have experience with imperative (e.g., Apache Airflow / NiFi) or declarative (e.g., Informatica/Talend/Pentaho) ETL design, implementation, and maintenance (see the sketch after this list)
  • You have functional knowledge of relational databases and query authoring (SQL)
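
For a rough illustration of the imperative ETL style called out above, here is a minimal sketch of an Airflow DAG in Python (assuming a recent Airflow 2.x release). Every name in it, from the DAG id to the tasks and sample data, is hypothetical and only demonstrates the extract-transform-load pattern, not Hagerty's actual pipelines:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract(**context):
        # Placeholder extract: a real pipeline would query a relational
        # source with SQL via an Airflow connection.
        return [{"policy_id": 1, "premium": 100.0}]


    def transform(**context):
        # Pull the extracted rows from XCom and apply a simple transformation.
        rows = context["ti"].xcom_pull(task_ids="extract")
        return [{**row, "premium_usd": round(row["premium"], 2)} for row in rows]


    def load(**context):
        # Placeholder load: a real pipeline would write to a warehouse table.
        rows = context["ti"].xcom_pull(task_ids="transform")
        print(f"Loading {len(rows)} rows")


    with DAG(
        dag_id="example_policy_pipeline",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        extract_task >> transform_task >> load_task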

Nice To Haves

  • Associate's degree, preferably in a technical/analytical field, or relevant work experience
  • An additional 3+ years in another role on an IT delivery team, such as developer, engineer, data analyst, quality assurance analyst, ETL developer, or DBA
  • Experience developing infrastructure as code in a cloud-based environment (Terraform preferred)
  • Experience cataloging and processing non-relational data
  • Experience with open-source data processing technologies such as streaming services (Kafka / SQS), big data processing frameworks (MapReduce/Spark), big data file stores (EMRFS / HDFS)
  • Experience evaluating different data formats based on workload needs (JSON, delimited files, Avro, Parquet); see the sketch after this list
  • Experience with container-based development
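
As a small, hypothetical example of weighing data formats against workload needs, the sketch below converts line-delimited JSON (row-oriented, easy to append) into Parquet (columnar, efficient to scan) using Python and pyarrow. The file and column names are placeholders:

    import json

    import pyarrow as pa
    import pyarrow.parquet as pq

    # Row-oriented input: one JSON record per line (file name is a placeholder).
    with open("events.jsonl") as f:
        records = [json.loads(line) for line in f]

    # Columnar output: Parquet stores each field contiguously, which makes
    # analytical scans over a few columns much cheaper than re-reading JSON.
    table = pa.Table.from_pylist(records)
    pq.write_table(table, "events.parquet")

    # Read back only the columns a downstream workload needs.
    subset = pq.read_table("events.parquet", columns=["event_type"])  # hypothetical column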

Responsibilities

  • Implement best practices around software development and big data engineering
  • Develop and implement robust and scalable data pipelines using Python, SQL, parallel processing frameworks, and AWS/Salesforce cloud solutions
  • Develop and implement batch data pipelines using tools such as Apache Airflow, Snowflake, and numerous AWS products (EC2, Fargate, ECS, Lambda, and RDS)
  • Develop streaming data integrations to support products across the Hagerty portfolio and enable real-time reporting (see the sketch after this list)
  • Develop Enterprise Data Hub platform infrastructure using Terraform infrastructure-as-code
  • Develop and support Hagerty’s cloud-based data warehouse to enable analytics and product reporting
  • Partner with internal and external stakeholders to collect requirements, recommend best practice solutions, and productionize new data ingestions/analytic workloads
  • Develop solutions to catalog and manage metadata to support data governance and data democratization
  • Partner with Data Quality Engineers to define and implement automated test cases and data reconciliation to validate ETL processes and data quality & integrity
  • Mentor junior team members in software and big data engineering best practices
  • Partner with Data Scientists to design, code, train, test, deploy and iterate machine learning algorithms and systems at scale
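
To make the streaming-integration responsibility more concrete, here is a minimal, hypothetical sketch of a Python consumer that long-polls an SQS queue and hands each event to a placeholder processing step. The queue URL and event handling are assumptions for illustration only, not Hagerty's actual integration:

    import json

    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-events"  # hypothetical


    def process(event: dict) -> None:
        # Placeholder: a real integration might transform the event and land it
        # in the warehouse or push it to a real-time reporting store.
        print(event)


    while True:
        # Long-poll for up to 10 messages at a time.
        response = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,
        )
        for message in response.get("Messages", []):
            process(json.loads(message["Body"]))
            # Delete only after successful processing so failures are retried.
            sqs.delete_message(
                QueueUrl=QUEUE_URL,
                ReceiptHandle=message["ReceiptHandle"],
            )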