Data Engineer

Aretum
McLean, VA
Remote

About The Position

Aretum is a mission-driven organization committed to delivering innovative, technology-enabled solutions to our customers across defense, civilian, and homeland security sectors. Our teams work at the intersection of strategy, technology, and transformation, helping agencies solve their most critical challenges. We believe in investing in our people and creating a culture where collaboration, inclusion, and professional growth are at the forefront.

Aretum is seeking a skilled and highly motivated Data Engineer. As a Data Engineer, you will build and manage all data ingestion, transformation, reconciliation, and analytics pipelines.

Due to the nature of our work as a federal consulting organization, employees may be expected to handle Controlled Unclassified Information (CUI) and must adhere to applicable safeguarding and compliance requirements.

Requirements

  • Programming: Python (primary), SQL (advanced), optional Scala
  • Data Processing Frameworks: Apache Spark, AWS EMR, Databricks (preferred)
  • ETL/ELT Design: Pipeline orchestration, incremental vs full loads, data validation
  • API Integration: REST APIs, JSON parsing, pagination, authentication (OAuth2)
  • FHIR Data Handling: Patient, MedicationRequest, Observation, etc.
  • Data Modeling: Relational and semi-structured schema design
  • Data Quality & Validation: Deduplication, reconciliation logic, anomaly detection
  • Streaming vs Batch Processing: Understanding tradeoffs and implementation patterns
  • Storage Technologies: S3, relational DBs, NoSQL basics
  • Performance Optimization: Partitioning, parallelization, query tuning
  • Versioning & Lineage: Data version control, reproducibility of datasets
  • Public Trust Eligibility Required
  • U.S. Work Authorization: Due to federal contract requirements, only U.S. citizens are eligible for this position.
  • This position supports a federal government contract and requires the ability to obtain and maintain a Public Trust or Suitability Determination, depending on the agency’s background investigation requirements.

Responsibilities

  • Ingest data from FHIR APIs, CDW, and other VA sources
  • Normalize and reconcile medication and patient data
  • Build transformation pipelines for risk scoring inputs
  • Support batch and near-real-time processing
  • Ensure data quality, consistency, and traceability

Benefits

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off
  • Family Leave (Maternity, Paternity)
  • Short-Term & Long-Term Disability
  • Training & Development