Data Engineer - Managing Consultant

Guidehouse
Tysons Corner, VA
Hybrid

About The Position

Design, build, and optimize scalable, production-grade data ingestion, transformation, and analytics-ready pipelines using Databricks (Delta Lake, Delta Live Tables, Auto Loader) and AWS services, enabling trusted, timely data access across enterprise use cases. Engineer standardized, repeatable data pipelines supporting batch and near real-time processing, integrating legacy data sources and modern cloud-native services to advance enterprise data availability and eliminate data access gaps.

Execute the full delivery lifecycle by supporting intake, discovery, source profiling, and technical design, while maintaining requirements traceability and aligning solutions to Architecture Review Board (ARB) and governance expectations. Implement and maintain governed data pipelines with embedded metadata, lineage, and data quality controls, ensuring pipelines meet defined technical, security, and documentation requirements before production deployment. Develop and operationalize data engineering frameworks that incorporate observability, monitoring, alerting, and resilient error handling to maintain platform stability and support ≥99.9% availability targets.

Partner with platform engineering and cloud operations teams to integrate pipelines with AWS services (S3, Glue, Kafka/Kinesis, APIs), enabling secure, scalable data movement and cross-platform interoperability. Enable governed analytics and self-service data consumption through curated datasets, semantic layers, and SQL warehouse integration, supporting enterprise reporting, dashboards, and advanced analytics use cases. Apply security and compliance controls aligned to IRS cybersecurity policies, including RBAC/ABAC, data masking, encryption, and audit logging, to protect sensitive data and maintain regulatory compliance.

Requirements

  • Bachelor’s degree is required
  • Minimum EIGHT (8) years of experience in data engineering within cloud-based environments
  • Minimum FOUR (4) years of hands-on Databricks experience designing scalable data pipelines
  • Advanced proficiency in Python, PySpark, and SQL for large-scale data processing and pipeline development in lakehouse architectures
  • Demonstrated experience building and optimizing data pipelines leveraging Delta Lake, medallion architecture, and modern data ingestion frameworks (batch and streaming)
  • Strong experience with AWS data platforms and services (e.g., S3, IAM, VPC, Glue, streaming frameworks) and integration with enterprise data ecosystems
  • Experience delivering production-ready data solutions incorporating metadata management, lineage, data quality, and observability frameworks
  • Familiarity with DevSecOps and CI/CD pipeline implementation, including automation, testing, and deployment within cloud data environments
  • Knowledge of data governance, security, and compliance requirements within regulated environments, including FISMA and FedRAMP High

Nice To Haves

  • Experience supporting federal data platforms or large-scale enterprise data modernization efforts, particularly within IRS, Treasury, or similar regulated environments
  • Hands-on experience with Databricks Unity Catalog, Delta Sharing, Genie, and Clean Rooms for governed data access and collaboration
  • Experience implementing streaming and near real-time data pipelines using Kafka, Kinesis, EventBridge, or similar technologies
  • Familiarity with Informatica EDC/Axon or enterprise metadata/catalog tooling
  • Experience with performance optimization techniques (Photon, Z-ordering, liquid clustering) to improve large-scale data workloads
  • Exposure to CI/CD automation using Terraform, GitHub Actions, and Databricks Asset Bundles for infrastructure and pipeline deployment
  • Advanced cloud or Databricks certifications (AWS, Databricks) in good standing

Responsibilities

  • Design, build, and optimize scalable, production-grade data ingestion, transformation, and analytics-ready pipelines using Databricks and AWS services.
  • Engineer standardized, repeatable data pipelines supporting batch and near real-time processing.
  • Execute the full delivery lifecycle, including intake, discovery, source profiling, and technical design.
  • Implement and maintain governed data pipelines with embedded metadata, lineage, and data quality controls.
  • Develop and operationalize data engineering frameworks with observability, monitoring, alerting, and error handling.
  • Partner with platform engineering and cloud operations teams to integrate pipelines with AWS services.
  • Enable governed analytics and self-service data consumption through curated datasets, semantic layers, and SQL warehouse integration.
  • Apply security and compliance controls aligned to IRS cybersecurity policies.

Benefits

  • Competitive compensation
  • Flexible benefits package
  • Medical, Rx, Dental & Vision Insurance
  • Personal and Family Sick Time & Company Paid Holidays
  • Parental Leave and Adoption Assistance
  • 401(k) Retirement Plan
  • Basic Life & Supplemental Life
  • Health Savings Account, Dental/Vision & Dependent Care Flexible Spending Accounts
  • Short-Term & Long-Term Disability
  • Student Loan PayDown
  • Tuition Reimbursement, Personal Development & Learning Opportunities
  • Skills Development & Certifications
  • Employee Referral Program
  • Corporate Sponsored Events & Community Outreach
  • Emergency Back-Up Childcare Program
  • Mobility Stipend