Data Engineer

Flowcode · New York, NY
$150,000 - $170,000 · Hybrid · Posted 4h ago

About The Position

We are looking for an experienced Data Engineer to join our dynamic team. In this role, you will design, build, and maintain scalable data pipelines that power our data-driven decision-making. You will work with a modern tech stack, including Snowflake, Python, AWS, Kafka, Docker, Kubernetes, DBT, and Airflow, to optimize our data infrastructure and ensure efficient, reliable data flow across the organization. In addition to applying strong technical skills, you will collaborate with cross-functional teams and contribute innovative solutions and best practices to continuously improve our data processes and tools.

Requirements

  • Education: Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or a related field, or equivalent practical experience in data engineering.
  • Experience: 3+ years of experience in data engineering or software development, with a strong background in building data pipelines and working with modern data architectures.
  • Programming: Proficient in Python.
  • Data Platforms: Hands-on experience with Snowflake, Fivetran, DBT, and Amazon DMS.
  • APIs & Streaming: Experience with RESTful/GraphQL APIs and Kafka.
  • Orchestration: Skilled with Apache Airflow, Docker, and Kubernetes.
  • Analytical Skills: Strong problem-solving abilities, with attention to detail and a focus on data quality.
  • Communication: Excellent verbal and written communication skills, with the ability to explain complex technical concepts to non-technical stakeholders.
  • Team Collaboration: Proven experience working collaboratively in agile, cross-functional teams.

Nice To Haves

  • Experience with additional programming languages such as Java or Scala.
  • Knowledge of additional data integration or processing tools.
  • Familiarity with cloud platforms (AWS, Google Cloud, or Azure) and their respective data services.
  • Experience processing large datasets using big data technologies such as PySpark.

Responsibilities

  • Data Pipeline Development: Design, develop, and maintain data pipelines for ingesting, processing, and transforming large datasets using Python.
  • Data Warehousing: Leverage Snowflake to build and manage robust data warehouses, ensuring efficient storage and retrieval of data.
  • Data Modeling & Transformations (DBT): Build, test, and maintain modular analytics models using DBT to ensure documented, versioned, and reliable transformations in Snowflake.
  • ETL Processes: Implement ETL workflows using Python, Fivetran, and Amazon DMS, ensuring data integrity, quality, and timely delivery.
  • API Integration: Develop and integrate APIs to facilitate seamless data exchange between internal systems and external data sources. Work with RESTful or GraphQL APIs to ensure reliable data ingestion.
  • Streaming Data Ingestion: Implement data ingestion solutions from streaming platforms, including Kafka, to handle real-time data processing.
  • Orchestration & Scheduling: Use Apache Airflow to schedule and monitor data workflows, ensuring consistent and reliable pipeline execution.
  • Containerization & Orchestration: Deploy and manage applications using Docker and Kubernetes to create scalable, containerized solutions.
  • Optimization & Troubleshooting: Continuously monitor, optimize, and troubleshoot data processes and infrastructure for performance improvements and reliability.
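For illustration only, here is a minimal sketch of the kind of Python pipeline work the responsibilities above describe: an Airflow DAG that orchestrates a small extract-and-load workflow. It assumes Airflow 2.x, and the DAG name, task names, and sample rows are hypothetical placeholders rather than anything specific to Flowcode's stack.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract_orders():
        # Hypothetical extraction step: in a real pipeline this might call a
        # source API or consume from Kafka; here it just returns sample rows.
        return [{"order_id": 1, "amount": 42.0}, {"order_id": 2, "amount": 17.5}]


    def load_to_warehouse(ti):
        # Hypothetical load step: in a real pipeline this might stage the rows
        # and COPY them into Snowflake; here it only logs the upstream output.
        rows = ti.xcom_pull(task_ids="extract_orders")
        print(f"Would load {len(rows)} rows into the warehouse")


    with DAG(
        dag_id="orders_ingestion",  # hypothetical pipeline name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@hourly",
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
        load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)

        extract >> load  # run the extraction before the load

A production DAG would add retries, alerting, and real connections to Snowflake or Kafka, but the structure is the same: discrete tasks wired into a scheduled, monitored graph.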