Data Engineer - AWS/Databricks - Mid Level

Acuity Inc. • Reston, VA

Remote

About The Position

Acuity Inc. is seeking a highly skilled Data Engineer to join our Engineering Team and help drive the design and delivery of cloud-scale data platforms on AWS for federal clients. This role requires knowledge of, or experience with, Spark, Delta Lake, and distributed data pipelines on Databricks. The ideal candidate brings both engineering skill and strategic insight into enterprise data modernization.

Requirements

4+ years of experience in data engineering and Agile analytics
4+ years of experience creating software for retrieving, parsing, and processing structured and unstructured data
2+ years of experience building scalable ETL and ELT workflows for reporting and analytics
2+ years of experience building enterprise data engineering solutions in the cloud, preferably with cloud-native technologies from AWS and Databricks
Experience with data quality, validation frameworks, and storage optimization strategies
BA or BS degree
Must be a US citizen with the ability to obtain and maintain US Suitability

Responsibilities

Build and maintain scalable PySpark-based data pipelines in Databricks notebooks to support ingestion, transformation, and enrichment of structured and semi-structured data.
Design and implement Delta Lake tables optimized for ACID compliance, partition pruning, schema enforcement, and query performance across large datasets.
Develop ETL and ELT workflows that integrate multiple source systems into a centralized, query-optimized data warehouse architecture.
Leverage Spark SQL and DataFrame APIs to implement business rules, dimensional joins, and aggregation logic aligned to warehouse modeling best practices.
Collaborate with data architects and engineers to implement cloud-native data solutions on AWS using S3, Glue, RDS, and IAM for secure, scalable storage and access control.
Optimize pipeline performance through intelligent partitioning, caching, broadcast joins, and adaptive query tuning.
Deploy and version data engineering assets using Git-integrated development workflows and automate deployment with CI/CD tools such as GitLab or Jenkins.
Monitor pipeline health, job execution, and cluster utilization using native Databricks tools and AWS CloudWatch, identifying bottlenecks and optimizing cost-performance tradeoffs.
Conduct technical discovery and mapping of legacy source systems, identifying required transformations and designing end-to-end data flows.
Implement governance practices including metadata tagging, data quality validation, audit logging, and lineage tracking using platform-native features and custom logic.
Support ad hoc data access requests, develop reusable data assets, and maintain shared notebooks that meet operational reporting and analytics needs across teams.