About The Position

We are seeking a Senior Data Engineer to design and build high-performance real-time data platforms that power analytics, machine learning, and operational intelligence. This role focuses on streaming data pipelines, distributed processing, and large-scale event data systems. You will build and operate low-latency data pipelines using technologies such as Apache Flink, Apache Druid, Kafka, and other modern data infrastructure, enabling real-time insights across large volumes of structured and unstructured data. The role requires strong experience in stream processing architectures, distributed systems, and scalable data infrastructure.

Requirements

  • 7+ years of experience in data engineering or distributed systems development
  • Strong experience building streaming data pipelines
  • Hands-on experience with at least one major streaming framework
  • Experience with real-time analytical databases
  • Experience with large-scale distributed systems
  • Strong SQL skills and experience designing analytical data models
  • Experience building fault-tolerant, highly scalable pipelines
  • Proficiency in one or more programming languages: Java, Python
  • Experience with AWS

Nice To Haves

  • Experience operating Apache Flink clusters in production
  • Experience with Apache Druid real-time ingestion
  • Experience building low-latency OLAP analytics systems
  • Experience with Kubernetes-based data infrastructure
  • Experience with Iceberg / Hudi / Delta Lake
  • Experience with real-time ML feature pipelines
  • Experience building observability for data platforms
  • Experience with high-volume event streams (billions of events/day)

Responsibilities

  • Design and implement real-time streaming data pipelines for high-volume event data.
  • Develop and operate distributed data processing systems using technologies such as Apache Flink, Apache Kafka, and Apache Druid.
  • Build scalable ingestion pipelines capable of handling millions of events per second.
  • Design low-latency analytical data stores for operational dashboards and real-time analytics.
  • Optimize data pipelines for performance, scalability, and fault tolerance.
  • Work with product and analytics teams to translate business needs into real-time data models.
  • Build and maintain data observability, monitoring, and reliability frameworks.
  • Implement schema evolution and data quality controls across streaming pipelines.
  • Contribute to data platform architecture decisions and infrastructure design.
  • Mentor junior engineers and promote best practices in data engineering and distributed systems.