Databricks Data Architect

Qode
California City, CA · Onsite

About The Position

We are seeking a Databricks Data Architect to support the design, implementation, and optimization of cloud-native data platforms built on the Databricks Lakehouse architecture. This is a hands-on, engineering-driven role requiring deep experience with Apache Spark, Delta Lake, and scalable data pipeline development, combined with early-stage architectural responsibilities.

The role involves close onsite collaboration with client stakeholders, translating analytical and operational requirements into robust, high-performance data architectures while adhering to best practices for data modeling, governance, reliability, and cost efficiency.

Requirements

  • Deep experience with Apache Spark
  • Hands-on experience with Delta Lake
  • Proven ability to develop scalable data pipelines

Nice To Haves

  • Exposure to Databricks Unity Catalog, data governance, and access control models
  • Experience with Databricks Workflows, Apache Airflow, or Azure Data Factory for orchestration
  • Familiarity with streaming frameworks (Spark Structured Streaming, Kafka) and/or CDC patterns
  • Understanding of data quality frameworks, validation checks, and observability concepts
  • Experience integrating Databricks with BI tools such as Power BI, Tableau, or Looker
  • Awareness of cost optimization strategies in cloud-based data platforms
  • Prior experience in the life sciences domain

Responsibilities

  • Design, develop, and maintain batch and near-real-time data pipelines using Databricks, PySpark, and Spark SQL
  • Implement Medallion (Bronze/Silver/Gold) Lakehouse architectures, ensuring proper data quality, lineage, and transformation logic across layers
  • Build and manage Delta Lake tables, including schema evolution, ACID transactions, time travel, and optimized data layouts
  • Apply performance optimization techniques such as partitioning strategies, Z-Ordering, caching, broadcast joins, and Spark execution tuning
  • Support dimensional and analytical data modeling for downstream consumption by BI tools and analytics applications
  • Assist in defining data ingestion patterns (batch, incremental loads, CDC, and streaming where applicable)
  • Troubleshoot and resolve pipeline failures, data quality issues, and Spark job performance bottlenecks

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: None listed
  • Number of Employees: 11-50
