Databricks Data Architect

Qode
California City, CA · Onsite

About The Position

We are seeking a Databricks Data Architect to support the design, implementation, and optimization of cloud-native data platforms built on the Databricks Lakehouse architecture. This is a hands-on, engineering-driven role requiring deep experience with Apache Spark, Delta Lake, and scalable data pipeline development, combined with early-stage architectural responsibilities.

The role involves close onsite collaboration with client stakeholders, translating analytical and operational requirements into robust, high-performance data architectures while adhering to best practices for data modeling, governance, reliability, and cost efficiency.

Requirements

  • Deep experience with Apache Spark
  • Hands-on experience with Delta Lake
  • Proven ability to develop scalable data pipelines

Nice To Haves

  • Exposure to Databricks Unity Catalog, data governance, and access control models
  • Experience with Databricks Workflows, Apache Airflow, or Azure Data Factory for orchestration
  • Familiarity with streaming frameworks (Spark Structured Streaming, Kafka) and/or CDC patterns
  • Understanding of data quality frameworks, validation checks, and observability concepts
  • Experience integrating Databricks with BI tools such as Power BI, Tableau, or Looker
  • Awareness of cost optimization strategies in cloud-based data platforms
  • Prior experience in the life sciences domain

Responsibilities

  • Design, develop, and maintain batch and near-real-time data pipelines using Databricks, PySpark, and Spark SQL
  • Implement Medallion (Bronze/Silver/Gold) Lakehouse architectures, ensuring proper data quality, lineage, and transformation logic across layers
  • Build and manage Delta Lake tables, including schema evolution, ACID transactions, time travel, and optimized data layouts
  • Apply performance optimization techniques such as partitioning strategies, Z-Ordering, caching, broadcast joins, and Spark execution tuning
  • Support dimensional and analytical data modeling for downstream consumption by BI tools and analytics applications
  • Assist in defining data ingestion patterns (batch, incremental loads, CDC, and streaming where applicable)
  • Troubleshoot and resolve pipeline failures, data quality issues, and Spark job performance bottlenecks

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: None listed
  • Number of Employees: 11-50
