Databricks Lead

INDEX ANALYTICS LLC
22h · $166,250 - $220,500 · Remote

About The Position

Index Analytics, LLC, is a rapidly growing, Baltimore-based small business providing health-related consulting services to the federal government. At the center of our company culture is a commitment to fostering a dynamic and employee-friendly workplace. We prioritize a supportive and collegial team environment and enhance the staff experience through career development and educational opportunities.

Position Overview

The incumbent will lead the development and implementation of Databricks pipelines across multiple environments, accessed from the Client's AWS Government Cloud. As part of a cross-functional project team that includes data engineers, database architects, analysts and analytic SMEs, the Databricks Lead will implement a scalable Databricks environment capable of ingesting diverse data sources to support advanced data analytics.

Requirements

  • U.S. citizen or otherwise authorized to work in the United States and able to demonstrate physical residency in the U.S. for at least three (3) of the past five (5) years.
  • Bachelor’s degree or higher in computer science or relevant discipline with eight (8) or more years of professional work experience, including six (6) or more years of data engineering experience designing and building pipelines across a variety of database technologies.
  • Three (3) or more years of hands‑on experience with the Databricks Platform and developer tools including performance tuning, use of Databricks CLI and REST APIs, and cost optimization for Databricks compute and storage.
  • Two (2) or more years of hands‑on programming experience with SAS, Python, JavaScript or similar tools.
  • Working experience with Databricks integrations, including Delta Lake, LakeFlow and Lakehouse Federation.
  • Proven experience with Databricks Unity Catalog and related governance/security frameworks as well as working knowledge of database security, including Access Control Lists (ACLs), Unity Catalog permissions, and RBAC controls.
  • Proficiency with cloud platforms such as AWS or Azure, including database deployment automation and CI/CD pipelines in an AWS environment.

Nice To Haves

  • Databricks Certified Data Engineer certification is a plus.
  • Experience working with unstructured and semi‑structured data (XML, JSON, Parquet, video, audio, images, PDF) is a plus.
  • Strong communication skills and commitment to continuous learning.

Responsibilities

  • Architect, develop and implement data models and data engineering pipelines to support ingestion of diverse CMS data sources, apply complex business rules derived from Python/R code reviews, and enable analytics through optimized Apache Spark datasets while maintaining strong security and governance practices.
  • Engineer and manage end‑to‑end data processing workflows, including receiving data files in multiple formats, applying validated business logic, loading transformed data into production pipelines, and delivering recurring and ad‑hoc data extracts to meet stakeholder and program needs.
  • Support and administer the Databricks environment, including managing user and access controls for clusters, jobs, notebooks and DDL deployments, optimizing performance, monitoring usage, and delivering cost and utilization reporting.
  • Ensure data quality and maintain documentation by conducting comprehensive QA validation on all ingested and delivered datasets, managing codebooks and user training guides, and maintaining complete metadata, source‑to‑target mappings and technical artifacts according to established schedules.

Benefits

  • Health and retirement benefits
  • Discretionary bonuses
  • Reimbursement for professional development opportunities