Lead Flink Data Engineer IoT

Fusemachines, New York City, NY
Posted 1 day ago

About The Position

This role is responsible for designing, building, and maintaining the real-time streaming infrastructure required for sensor data integration, complex state processing, and alerting in the Industrial IoT / Smart Buildings domain. We are seeking a Lead Data Engineer with expert-level Apache Flink skills (Java and SQL) and proven experience implementing custom ProcessFunctions, state management, and real-time data quality gates. The role delivers Data and Analytics products using Agile methodology. The ideal candidate will possess strong technical, analytical, and interpersonal skills.

Requirements

  • 5+ years of hands-on data engineering experience with deep expertise in the Azure ecosystem.
  • 5+ years in Java (backend), including 3+ years of deep specialization in Apache Flink (DataStream API & SQL).
  • Experience with Flink State Management, Checkpointing, and Watermark strategies.
  • Experience upgrading to or using Flink 2.0/2.x.
  • Experience with Azure Event Hubs or Kafka.
  • Understanding of "Sensor/IoT" data patterns (out-of-order and missing events).
  • Proficient in Java, SQL, and writing optimized data integration and processing code.
  • Strong understanding of SDLC and Agile methodologies with hands-on experience in Azure DevOps, GitHub, CI/CD, and artifact management.
  • Deep expertise in Azure data services.
  • Skilled in data modeling, database design, and data warehousing solutions on Azure.
  • Knowledge of data quality, governance, and security best practices within Azure (AD, NSG, encryption, compliance).

Nice To Haves

  • Certifications: Azure Fundamentals, Azure Data Engineer Associate, and Azure Solutions Architect Expert.

Responsibilities

  • Architect, design, and implement scalable and efficient data solutions on Flink.
  • Design and implement the Flink 2.x streaming pipeline for IoT.
  • Develop custom KeyedProcessFunctions to handle "Heartbeat Injection" (detecting sensor silence) and complex windowing.
  • Implement the Gatekeeper Pattern: A Flink job that applies Data Quality tags (Validity, Completeness) to the stream without dropping data.
  • Optimize State Backends for high-cardinality sensor data (100k+ concurrent sensors).
  • Work with the "Shared Core" team to integrate libraries into the Flink runtime.
  • Manage and optimize Azure resources (Flink clusters) for performance, reliability, and cost-efficiency.
  • Transform, clean, and prepare data using SQL and Java.
  • Monitor and fine-tune workloads and pipelines for optimal performance and reliability.
  • Maintain clear documentation of solutions, configurations, and workflows.
  • Actively participate in Agile team activities and continuous improvement initiatives.
  • Promote and enforce data engineering best practices, including data governance, security, and data quality.
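
For candidates unfamiliar with the term, the Gatekeeper Pattern named in the responsibilities can be sketched roughly as below. This is a minimal illustration in plain Java, not the team's actual implementation: the `Reading`, `TaggedReading`, and `Gatekeeper` types, the tag names, and the plausibility range are all hypothetical. The point it shows is that every incoming reading produces exactly one tagged output, so nothing is dropped.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sensor reading; the field names are illustrative assumptions. */
final class Reading {
    final String sensorId;   // may be null or empty if the payload was incomplete
    final Double value;      // null when the measurement is missing
    final long timestampMs;

    Reading(String sensorId, Double value, long timestampMs) {
        this.sensorId = sensorId;
        this.value = value;
        this.timestampMs = timestampMs;
    }
}

/** The original reading plus the quality tags attached to it. */
final class TaggedReading {
    final Reading reading;
    final List<String> tags;

    TaggedReading(Reading reading, List<String> tags) {
        this.reading = reading;
        this.tags = tags;
    }
}

final class Gatekeeper {
    // Assumed plausibility range; real limits would depend on the sensor type.
    static final double MIN_VALUE = -50.0;
    static final double MAX_VALUE = 150.0;

    /** Tags a reading for completeness and validity; never filters it out. */
    static TaggedReading tag(Reading r) {
        List<String> tags = new ArrayList<>();

        // Completeness: all required fields are present.
        boolean complete = r.sensorId != null && !r.sensorId.isEmpty() && r.value != null;
        tags.add(complete ? "COMPLETE" : "INCOMPLETE");

        // Validity: the value exists and falls inside the assumed range.
        boolean valid = r.value != null && r.value >= MIN_VALUE && r.value <= MAX_VALUE;
        tags.add(valid ? "VALID" : "INVALID");

        return new TaggedReading(r, Collections.unmodifiableList(tags));
    }
}
```

In a Flink job, logic of this kind would typically sit inside a map or process function so that downstream consumers can decide how to treat tagged records.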