Databricks Platform Architect

Koantek
Chandler, AZ
Remote

About The Position

We are seeking an accomplished, technology-driven Lead Data Platform Architect / Migration Specialist to spearhead the modernization of our core enterprise financial and tax allocation engines. In this role, you will lead the architectural design, migration strategy, and hands-on implementation needed to transition large-scale legacy relational database systems (SQL Server/T-SQL) to a modern, cloud-native Databricks Lakehouse platform. The ideal candidate will have extensive experience in high-throughput distributed systems, Databricks compute optimization, performance tuning, and complex pipeline orchestration.

Requirements

  • Expert-level knowledge of Databricks (Lakehouse architecture, Delta Lake, Unity Catalog) and Apache Spark / PySpark.
  • Strong background in relational databases, with advanced proficiency in SQL Server, T-SQL, and stored procedures. Ability to reverse-engineer legacy database logic and refactor it for distributed processing.
  • Hands-on experience with Apache Airflow or similar modern workflow orchestrators.
  • Proven track record in cost optimization (FinOps), cluster tuning, autoscaling configurations, and handling skewed data profiles.
  • Experience with Infrastructure as Code (Terraform), data build tool (dbt), testing frameworks (PyTest), and automated Git-based workflows.
  • 10+ years of experience in Data Engineering/Architecture, including at least 3 years specifically leading large-scale cloud data migrations.
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.

Nice To Haves

  • Databricks Certified Data Engineer Associate / Professional
  • Databricks Certified Solutions Architect
  • AWS Certified Database Specialist or equivalent Cloud Certifications

Responsibilities

  • Validate, refine, and own the target architecture on Databricks. Define robust migration strategies and production-ready reference patterns to convert 150+ complex stored procedures into PySpark and Structured/Declarative Pipelines (SDP).
  • Design distributed processing frameworks, control flows, and configuration-driven parameter handling for both full and incremental recalculation modes.
  • Address performance gaps between small and large workloads. Architect and implement acceleration techniques such as caching, partition pruning, cluster sizing, and offline/pre-calculation strategies to maintain sub-30-second user-facing reporting SLAs.
  • Design and deploy enterprise-level pipeline orchestration using tools such as Apache Airflow or Databricks Workflows. Integrate robust logging, error handling, and observability patterns into existing enterprise monitoring frameworks.
  • Implement data governance models, data lineage, and schema evolution using tools such as Unity Catalog.
  • Establish best practices for AI-assisted code generation (e.g., using Claude or advanced LLMs), providing code-review patterns and refactoring frameworks to ensure maintainable and performant output.
  • Lead code walkthroughs, design reviews, and pair-programming sessions with the development team to accelerate knowledge transfer and technical excellence.