Senior Data Engineer – Real-Time Streaming

Orbital Insight

54d • Remote

About The Position

We are seeking a Senior Data Engineer to design and build high-performance real-time data platforms that power analytics, machine learning, and operational intelligence. This role focuses on streaming data pipelines, distributed processing, and large-scale event data systems. You will build and operate low-latency data pipelines using Apache Flink, Apache Druid, Kafka, and other modern data infrastructure, enabling real-time insights across large volumes of structured and unstructured data. The role requires strong experience in stream processing architectures, distributed systems, and scalable data infrastructure.

Requirements

7+ years of experience in data engineering or distributed systems development
Strong experience building streaming data pipelines
Hands-on experience with at least one major streaming framework
Experience with real-time analytical databases
Experience with large-scale distributed systems
Strong SQL skills and experience designing analytical data models
Experience building fault-tolerant, highly scalable pipelines
Proficiency in one or more programming languages: Java, Python
Experience with AWS

Nice To Haves

Experience operating Apache Flink clusters in production
Experience with Apache Druid real-time ingestion
Experience building low-latency OLAP analytics systems
Experience with Kubernetes-based data infrastructure
Experience with Iceberg / Hudi / Delta Lake
Experience with real-time ML feature pipelines
Experience building observability for data platforms
Experience with high-volume event streams (billions of events/day)

Responsibilities

Design and implement real-time streaming data pipelines for high-volume event data.
Develop and operate distributed data processing systems using technologies such as Apache Flink, Apache Kafka, and Apache Druid.
Build scalable ingestion pipelines capable of handling millions of events per second.
Design low-latency analytical data stores for operational dashboards and real-time analytics.
Optimize data pipelines for performance, scalability, and fault tolerance.
Work with product and analytics teams to translate business needs into real-time data models.
Build and maintain data observability, monitoring, and reliability frameworks.
Implement schema evolution and data quality controls across streaming pipelines.
Contribute to data platform architecture decisions and infrastructure design.
Mentor junior engineers and promote best practices in data engineering and distributed systems.