Data Scientist - Analytics Engineer (N375)

Heluna Health, Los Angeles, CA (Hybrid)

About The Position

The Los Angeles County Department of Homeless Services and Housing (HSH) consolidates our countywide response to homelessness. The driving force behind HSH is increasing accountability and transparency, improving care for people experiencing or at risk of homelessness, and streamlining collaboration with partners, including service providers, the County’s 88 cities, and unincorporated areas, to deliver high-quality, life-saving care. Staff schedules are based on business need and may include the option for a hybrid work schedule where employees work remotely and from the office.

The Analytics Engineer plays a key role on the HSH Data Engineering team, building the semantic layer that supports performance tracking, evaluation, and policy guidance. This position translates analytic requirements into curated silver and gold data models using Databricks and supports integration of data from normalized backend systems. This is a chance to architect a scalable data environment from the ground up in a mission-driven context. The engineering team plays a central role in the County’s data strategy, with opportunities for mentorship, innovation, and cross-sector impact.

Requirements

  • Option I: Two (2) years of experience applying advanced statistical analyses, including predictive analytics or data engineering, to produce actionable recommendations that support data-driven program, policy, and operational decision-making, at a level equivalent to the Los Angeles County class of Predictive Data Analyst. Experience at the level of Predictive Data Analyst is defined as using machine learning techniques or data engineering practices to analyze, or support analysis of, complex data sets and find statistically significant, meaningful predictive patterns relevant to program goals that human intelligence could not identify on its own.
  • Option II: A Bachelor’s degree from an accredited college in a field of applied research such as Data Science, Machine Learning, Mathematics, Statistics, Business Analytics, Psychology, Computer Science, or Public Health that included 12 semester or 18 quarter units of coursework in data science, data engineering, predictive analytics, quantitative research methods, or statistical analysis -AND- Four (4) years of experience applying data engineering, machine learning, predictive analytics, and data management to conduct or support hypothesis-driven data analysis and produce actionable recommendations that support data-driven program, policy, and operational decision-making. A Master’s or Doctoral degree from an accredited college or university in a field of applied research such as Data Science, Machine Learning, Mathematics, Statistics, Business Analytics, Psychology, Public Health, or a similar related field may substitute for up to two (2) years of experience.
  • A valid California Class C Driver License or the ability to utilize an alternative method of transportation when needed to carry out job-related essential functions.
  • Successful clearance of Live Scan with the County of Los Angeles.
  • 4+ years of experience building data transformations and models in Databricks or Spark-based environments.
  • Strong knowledge of Medallion Architecture and curated model development.
  • Skilled in working with normalized datasets and applying entity resolution techniques to build clean, reliable analytic tables joined across systems (e.g., MDM-linked client records).
  • Experience using declarative syntax to manage and implement data transformations in tools such as Delta Live Tables, dbt, and Spark (see the illustrative sketch after this list).
  • Proficient in SQL, Python, GitHub, and CI/CD workflows.
  • Experience developing and maintaining Databricks notebooks used in orchestrated jobs, including environment-based configuration using YAML/JSON.
  • Understanding of HIPAA, FERPA, and governance in health and social service data.
  • Ability to work across technical and program teams and contribute to shared engineering practices.
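
To make the declarative-transformation requirement concrete, here is a minimal sketch of a Delta Live Tables definition in Python. The table and column names (bronze_clients, silver_clients, client_id, enrollment_date) are hypothetical, not the department's actual schema; the point is the pattern of a decorated function plus a data-quality expectation rather than imperative pipeline code.

    import dlt
    from pyspark.sql import functions as F

    # Hypothetical silver-layer table defined declaratively from a bronze source.
    # DLT infers the dependency from dlt.read() and enforces the expectation below.
    @dlt.table(
        name="silver_clients",
        comment="Cleaned, deduplicated client records (illustrative example)"
    )
    @dlt.expect_or_drop("valid_client_id", "client_id IS NOT NULL")
    def silver_clients():
        return (
            dlt.read("bronze_clients")  # hypothetical bronze source table
            .withColumn("enrollment_date", F.to_date("enrollment_date"))
            .dropDuplicates(["client_id"])
        )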

Nice To Haves

  • Familiarity with Azure Data Factory, Synapse, and Terraform.
  • Experience supporting dashboards (Power BI, Tableau) and ensuring downstream data usability.

Responsibilities

  • Build and maintain semantic data models (silver and gold layers) in Spark/Databricks, primarily through ETL pipelines written in PySpark (see the illustrative sketch after this list).
  • Understand and identify entity relationships among large collections of normalized backend tables to design accurate, denormalized, analyst-ready structures.
  • Contribute to schema and catalog design decisions, including naming conventions for static vs. live feeds and ad hoc data use cases. This includes creating and maintaining documentation that clarifies data model logic, table relationships, and mapping assumptions to support downstream users and internal knowledge transfer.
  • Collaborate with program and analytic teams to understand and translate both the business rules used to define data fields and the needs of the analytic teams that use the semantic layer to produce reports and dashboards.
  • Collaborate with the Privacy Engineer to ensure analytic datasets align with RBAC policies, de-identification requirements, and data classification standards set for Departmental and Countywide use.
  • Contribute to, update, and maintain centralized code repositories used for data transformations.
  • Participate in Dev/Prod promotion workflows using GitHub, ensuring proper validation and configuration for CI/CD deployment. Apply expectations and version control to standardize, test, and document pipelines.
  • Collaborate with Evaluation and Reporting teams to align models with use cases and downstream needs.
  • Participate in integration of external and internal sources using Azure services and contribute to scalable, secure data pipelines.
  • Support data modeling standards and technical documentation practices.
  • Assist in onboarding and training analysts to use the semantic layer effectively.
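
As an illustration of the silver-to-gold modeling described in the first responsibility, below is a minimal PySpark sketch that joins two hypothetical silver tables into a denormalized, analyst-ready gold table. All table and column names are assumptions made for the example, not HSH data structures.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical silver inputs: MDM-linked client records and program enrollments.
    clients = spark.read.table("silver.clients")
    enrollments = spark.read.table("silver.enrollments")

    # Denormalized gold table: one row per client with enrollment summary fields.
    gold_client_enrollments = (
        clients.join(enrollments, on="client_id", how="left")
        .groupBy("client_id", "client_name")
        .agg(
            F.countDistinct("enrollment_id").alias("enrollment_count"),
            F.max("enrollment_date").alias("latest_enrollment_date"),
        )
    )

    # Publish as a managed Delta table for downstream reporting and dashboards.
    (
        gold_client_enrollments.write
        .format("delta")
        .mode("overwrite")
        .saveAsTable("gold.client_enrollments")
    )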