Data Engineer

Public Partnerships | PPL
New York, NY
$90,000 - $120,000 (Remote)

About The Position

It's fun to work in a company where people truly BELIEVE in what they're doing! We're committed to bringing passion and customer focus to the business.

Public Partnerships LLC supports individuals with disabilities or chronic illnesses and aging adults so they can remain in their homes and communities and self-direct their own long-term home care. As the nation's largest and most experienced Financial Management Service provider, our role is to assist eligible Medicaid recipients in choosing and paying for their own support workers and services within their state-approved personalized budget. We are appointed by states and managed healthcare organizations to better serve more of their residents and members requiring long-term care, and to ensure the efficient use of taxpayer-funded services. Our culture attracts and rewards people who are results-oriented and strive to exceed customer expectations. We seek motivated candidates who are excited to join our fast-paced, entrepreneurial environment and who want to make a difference in helping transform the lives of the consumers we serve. (Learn more at www.pplfirst.com.)

Position Title: Data Engineer
Reports to: Manager, Data Analytics

Job Summary

We are seeking an experienced Azure Data Engineer to design, build, and optimize scalable data pipelines and lakehouse architectures within the Azure ecosystem. This role is responsible for developing end-to-end data ingestion, transformation, and governance frameworks using Azure Data Factory, Databricks, and ADLS Gen2 to support enterprise analytics and reporting needs. The ideal candidate will have hands-on experience managing batch and near real-time data ingestion from diverse structured and unstructured sources, implementing Spark-based transformation processes across the Bronze, Silver, and Gold data layers, and establishing robust data quality, security, and monitoring frameworks. The position also requires expertise in data governance, automation, and cost optimization to ensure high-performing, secure, and reliable data platforms aligned with organizational SLAs.

Requirements

  • Azure Data Factory for ETL/ELT operations
  • Azure Databricks and the lakehouse architecture for building and supporting self-service BI solutions
  • PySpark and Spark SQL for developing data pipelines and transformations
  • Databricks AI/BI features and Genie for enabling analytics for business users
  • ML techniques for data insights, predictions, and feature engineering
  • Collaboration with analytics and business teams to deliver scalable, governed BI and ML solutions
  • Azure Data Lake Storage (ADLS Gen2)
  • SQL Server, PostgreSQL, Cosmos DB, CRM, and ERP systems
  • Data governance, RBAC, and ACLs for managing permissions
  • CI/CD tools: Azure DevOps, GitHub Actions, Terraform
  • Azure monitoring, alerting, and logging tools
  • Bachelor's degree in Computer Science, Information Systems, Data Science, or a related field (Master's preferred)
  • Minimum 4–5 years of relevant experience
  • Substantial professional experience in a related field may be considered in lieu of a formal degree

Nice To Haves

  • AI/ML skills

Responsibilities

  • Design and manage scalable data ingestion pipelines using Azure Data Factory and Azure Service Bus for batch and near real-time processing
  • Implement Bronze, Silver, and Gold data layer transformations using Databricks (PySpark, SparkSQL) within a lakehouse architecture
  • Integrate structured, semi-structured, and unstructured data sources using formats such as Parquet, Avro, and JSON into ADLS Gen2
  • Establish data quality frameworks and validation processes using Databricks and Azure Data Factory
  • Configure RBAC and ACL-based access controls to secure sensitive datasets and ensure compliance
  • Manage credential security with Azure Key Vault and enforce governance standards across data platforms
  • Automate pipeline orchestration using Databricks Workflows and CI/CD tools such as Azure DevOps, GitHub Actions, and Terraform
  • Monitor pipeline health and performance using Azure Monitor, logging, and alerting to meet SLA requirements
  • Optimize ADLS storage performance and manage cloud costs using Azure Cost Management best practices
  • All other duties as assigned
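As a purely illustrative sketch of the Bronze-to-Silver cleansing and data quality validation work described above (the column names, quality rules, and helper functions are hypothetical, and plain Python dicts stand in for PySpark DataFrames):

```python
# Hypothetical sketch of a Bronze -> Silver data quality step.
# In the actual stack this would run as a PySpark job in Databricks;
# plain Python is used here only to illustrate the validation logic.

REQUIRED_FIELDS = {"member_id", "service_date", "amount"}  # assumed schema


def validate_record(record: dict) -> list:
    """Return a list of data quality violations for one Bronze-layer record."""
    errors = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        errors.append("missing fields: %s" % sorted(missing))
    if "amount" in record:
        try:
            if float(record["amount"]) < 0:
                errors.append("amount must be non-negative")
        except (TypeError, ValueError):
            errors.append("amount is not numeric")
    return errors


def bronze_to_silver(bronze_rows: list) -> tuple:
    """Split raw Bronze rows into cleansed Silver rows and a quarantine set."""
    silver, quarantine = [], []
    for row in bronze_rows:
        problems = validate_record(row)
        if problems:
            quarantine.append({"row": row, "errors": problems})
        else:
            # Normalize types on the way into the Silver layer.
            silver.append({**row, "amount": float(row["amount"])})
    return silver, quarantine


bronze = [
    {"member_id": "M1", "service_date": "2024-01-05", "amount": "120.50"},
    {"member_id": "M2", "service_date": "2024-01-06", "amount": "-3"},
    {"member_id": "M3", "amount": "10"},  # missing service_date
]
silver, quarantine = bronze_to_silver(bronze)
```

In a real pipeline the quarantined rows would typically be written to a separate table for monitoring and reprocessing rather than silently dropped.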