Databricks Data Engineer

Gentiva
Vinings, GA

About The Position

The Databricks Data Engineer will be part of the Data Services team and help transform the delivery of data-driven insights at scale. In this role, they will design and engineer robust data pipelines using technologies such as Databricks, Azure Data Factory, Apache Spark, and Delta Lake. The work is hands-on, crafting healthcare data solutions: processing massive healthcare datasets, optimizing performance, and ensuring our data is accurate, secure, and accessible when it matters most.

Requirements

  • Excellent problem-solving and analytical skills
  • Strong oral and written communication abilities
  • Self-motivated with ability to adapt to new technologies quickly
  • Team player with ability to work independently
  • Detail-oriented with strong organizational skills
  • Ability to manage multiple priorities and meet deadlines
  • Experience communicating technical concepts to non-technical stakeholders
  • Expert-level knowledge of Databricks Workspace, clusters, and notebooks
  • Delta Lake implementation and optimization
  • Unity Catalog for data governance and cataloging
  • Databricks SQL and SQL Analytics
  • Databricks Workflows and job orchestration
  • Delta Live Tables (DLT) for pipeline orchestration and data quality
  • Advanced Python programming (PySpark, pandas, NumPy)
  • Advanced SQL (query optimization, performance tuning)
  • Git version control and collaborative development
  • Azure Databricks
  • Cloud storage services (ADLS Gen2, Azure Blob Storage)
  • Azure Data Factory for pipeline orchestration and integration
  • Experience designing and managing Azure Data Factory pipelines, triggers, and linked services
  • Infrastructure as Code (Terraform)
  • Experience with BI tools (Power BI, SSRS)
  • Data warehousing and data modeling concepts
  • SQL Server, including SSIS (Integration Services)
  • Bachelor's degree in Computer Science, Information Technology, or a related field
  • 5+ years of progressive experience in data engineering, analytics, or software development
  • 3+ years of hands-on experience with Databricks platform
  • Strong experience with Apache Spark and PySpark

Nice To Haves

  • Scala programming
  • MLflow for ML lifecycle management
  • Experience with complex data modeling including dimensional modeling, star/snowflake schemas
  • Experience with medallion architecture (bronze/silver/gold layers)
  • Data quality and validation framework implementation
  • CI/CD pipeline development for data workflows (Azure DevOps)
  • Performance tuning and cost optimization
  • DataOps and DevOps practices
  • Healthcare IT or healthcare data experience
  • Databricks Certified Data Engineer Associate (strongly preferred)
  • Databricks Certified Data Engineer Professional
  • Databricks Lakehouse Fundamentals
  • Azure Data Engineer Associate (DP-203)
  • Apache Spark certifications

Responsibilities

  • Translate business requirements into technical specifications and document solution designs, data flows, and architecture
  • Design, develop, and maintain ETL/ELT pipelines using Azure Data Factory, Databricks, and Apache Spark
  • Implement Delta Lake architecture for reliable data storage and processing
  • Build and optimize data workflows using Databricks Workflows and Jobs
  • Develop scalable data models following medallion architecture (bronze, silver, gold layers)
  • Implement Unity Catalog for data governance, access control, and metadata management
  • Create and maintain Databricks notebooks for data transformation and analysis
  • Optimize Spark jobs for performance and cost efficiency
  • Implement data quality checks and validation frameworks
  • Collaborate with BI developers, data analysts, and data scientists
  • Design and implement data orchestration workflows using Azure Data Factory to coordinate complex ETL/ELT processes
  • Develop and maintain CI/CD pipelines for data workflows
  • Monitor data pipeline performance and troubleshoot issues
  • Document data processes, architectures, and best practices
  • Ensure compliance with data security and privacy regulations
  • Provide support for new and existing solutions

Benefits

  • Comprehensive Benefits Package: Health Insurance, 401(k) Plan, Tuition Reimbursement, PTO
  • Opportunity to participate in a Fleet Program
  • Competitive Salaries
  • Mileage Reimbursement
  • Professional growth and development opportunities