Data Lead Engineer (W2 Candidates Only)

Mega Cloud Lab
Fremont, CA
Onsite

About The Position

This role is for a Data Lead Engineer with a focus on designing and modernizing enterprise data platforms. The ideal candidate will have extensive experience in architecting scalable ETL/ELT pipelines, leading enterprise-scale Data Architecture initiatives, and engineering large-scale distributed processing workloads. Experience with cloud-native platforms on AWS and Azure is essential, as is a strong understanding of data modeling, governance, and real-time streaming architectures.

Requirements

  • Azure Data Factory (ADF)
  • Azure Databricks & PySpark
  • Azure Synapse
  • Azure SQL
  • Python
  • Spark SQL
  • 12+ years of experience designing and modernizing enterprise data platforms.
  • Experience architecting scalable ETL/ELT pipelines using Python, PySpark, Databricks, AWS Glue and Azure Data Factory.
  • Experience leading enterprise-scale Data Architecture initiatives, defining logical and physical data models, governance standards and cloud-native platform blueprints across AWS and Azure environments.
  • Experience designing and implementing Medallion (Bronze/Silver/Gold) Lakehouse architectures using Delta Lake, S3, ADLS Gen2, Snowflake, Redshift and Synapse Analytics.
  • Experience engineering large-scale distributed processing workloads using Apache Spark, PySpark, Databricks, EMR, Hive and HDFS.
  • Experience orchestrating complex data workflows using Apache Airflow, Databricks Workflows, AWS Step Functions and Azure Data Factory triggers.
  • Strong hands-on experience in Advanced SQL, including complex joins, CTEs, window functions, stored procedures, indexing strategies, partitioning and execution plan optimization across Snowflake, PostgreSQL and Oracle.
  • Experience building real-time streaming architectures using Apache Kafka, AWS Kinesis, Azure Event Hub and Service Bus.
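As a sketch of the advanced SQL patterns named above (CTEs and window functions), the snippet below uses the SQLite engine bundled with Python; the table, columns, and data are invented purely for illustration, not taken from any system mentioned in this posting.

```python
import sqlite3

# Hypothetical sales table for the illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('west', '2024-01-05', 120.0),
        ('west', '2024-01-06', 80.0),
        ('east', '2024-01-05', 200.0),
        ('east', '2024-01-07', 50.0);
""")

# A CTE feeds a window function that ranks orders by amount within each region.
rows = conn.execute("""
    WITH regional AS (
        SELECT region, order_date, amount FROM orders
    )
    SELECT region, order_date, amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
    FROM regional
    ORDER BY region, rnk
""").fetchall()

for row in rows:
    print(row)  # e.g. ('east', '2024-01-05', 200.0, 1)
```

The same CTE-plus-window shape carries over to Snowflake, PostgreSQL and Oracle, though partitioning and execution-plan tuning are engine-specific.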

Responsibilities

  • Designing and modernizing enterprise data platforms across banking, healthcare and retail domains.
  • Architecting scalable ETL/ELT pipelines using Python, PySpark, Databricks, AWS Glue and Azure Data Factory.
  • Leading enterprise-scale Data Architecture initiatives, defining logical and physical data models, governance standards and cloud-native platform blueprints across AWS and Azure environments.
  • Designing and implementing Medallion (Bronze/Silver/Gold) Lakehouse architectures using Delta Lake, S3, ADLS Gen2, Snowflake, Redshift and Synapse Analytics.
  • Engineering large-scale distributed processing workloads using Apache Spark, PySpark, Databricks, EMR, Hive and HDFS, processing billions of records for enterprise analytics.
  • Orchestrating complex data workflows using Apache Airflow, Databricks Workflows, AWS Step Functions and Azure Data Factory triggers, ensuring SLA-driven pipeline execution.
  • Building real-time streaming architectures using Apache Kafka, AWS Kinesis, Azure Event Hub and Service Bus, supporting fraud detection, claims monitoring and operational telemetry.
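A minimal, engine-agnostic sketch of the Bronze/Silver/Gold flow the responsibilities describe: in practice each layer would be Delta Lake tables on S3 or ADLS Gen2 processed with PySpark, but here plain Python structures stand in, and the record fields are invented for illustration.

```python
from collections import defaultdict

# Bronze: raw ingested events, kept as-is (duplicates, nulls and all).
bronze = [
    {"id": 1, "store": "A", "amount": "10.5"},
    {"id": 1, "store": "A", "amount": "10.5"},  # duplicate ingest
    {"id": 2, "store": "B", "amount": None},    # bad record
    {"id": 3, "store": "A", "amount": "4.0"},
]

# Silver: cleaned and conformed -- drop nulls, dedupe on id, cast types.
seen = set()
silver = []
for rec in bronze:
    if rec["amount"] is None or rec["id"] in seen:
        continue
    seen.add(rec["id"])
    silver.append({**rec, "amount": float(rec["amount"])})

# Gold: business-level aggregate, e.g. revenue per store.
gold = defaultdict(float)
for rec in silver:
    gold[rec["store"]] += rec["amount"]

print(dict(gold))  # {'A': 14.5}
```

The design point is the same at any scale: raw data lands untouched in Bronze so pipelines are replayable, quality rules live in the Silver step, and Gold holds only consumption-ready aggregates.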
© 2026 Teal Labs, Inc