Data Engineer

Anika Systems, Leesburg, VA
Remote

About The Position

Anika Systems is seeking a skilled Data Engineer to design, build, and optimize scalable data pipelines and platforms supporting federal clients. This role plays a critical part in enabling enterprise data strategies, supporting Office of the Chief Data Officer (OCDO) initiatives, and delivering high-quality, trusted data for analytics, reporting, and mission operations. The ideal candidate has hands-on experience with ETL/ELT pipelines, XBRL data processing, Apache Iceberg-based architectures, and advanced data optimization techniques such as materialized views and context-aware data engineering. The role also requires proficiency in AI tools and AI-assisted development workflows, along with experience building and deploying CI/CD pipelines for data and analytics platforms. This opportunity is 100% remote.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, Data Science, or related field.
  • 5+ years of experience in data engineering, ETL development, or data platform engineering.
  • Strong hands-on experience with ETL/ELT tools and frameworks.
  • Strong hands-on experience with AWS data services (e.g., S3, Glue, Lambda, Redshift).
  • Strong hands-on experience with Apache Iceberg and modern data lake architectures.
  • Experience designing and implementing CI/CD pipelines for data platforms and ETL workflows.
  • Demonstrated proficiency using AI tools and AI-assisted development workflows (e.g., LLM copilots, automated code generation, pipeline optimization tools).
  • Experience processing XBRL or complex financial/regulatory datasets.
  • Proficiency in SQL and Python.
  • Experience implementing materialized views and query optimization techniques.
  • Understanding of data modeling concepts and metadata management.
  • Familiarity with data governance, data quality practices, and data readiness for AI/ML use cases.
  • Ability to work in Agile, DevOps-oriented environments.
  • U.S. Citizenship required; ability to obtain and maintain a federal clearance.

Nice To Haves

  • Experience supporting federal agencies such as the SEC, DHS, or the Federal Reserve System.
  • Familiarity with data catalog tools (e.g., Collibra, Alation, ServiceNow).
  • Experience with Apache Spark, Kafka, or other distributed data processing frameworks.
  • Experience enabling data pipelines for AI/ML or generative AI applications.
  • Knowledge of data maturity frameworks (e.g., EDM DCAM, TDWI).
  • Exposure to context engineering or semantic data layer design.
  • AWS or data engineering certifications.
  • Experience with infrastructure-as-code (IaC) tools (e.g., Terraform, CloudFormation) in support of CI/CD pipelines.

Responsibilities

Pipeline Development and Data Ingestion

  • Design, develop, and maintain robust ETL/ELT pipelines to ingest, transform, and deliver data across enterprise platforms.
  • Build scalable data ingestion frameworks for structured and semi-structured data, including XBRL filings and financial datasets.
  • Implement data transformation logic to support analytics, reporting, and regulatory use cases.
  • Ensure data pipelines are reliable, performant, and scalable in cloud environments.
  • Leverage AI-assisted development tools to accelerate pipeline development, testing, and optimization.
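
As a purely illustrative example of the ingestion work this group describes, the sketch below reads a JSON Lines drop from S3 with boto3 and pandas and lands it back as date-partitioned Parquet. The bucket, key, and column names (accession_number, filed_at) are assumptions, not details from the posting.

```python
# Illustrative ingestion step: read a JSON Lines drop from S3, normalize it,
# and land it as date-partitioned Parquet. Bucket, key, and column names
# (accession_number, filed_at) are assumptions, not posting details.
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")

def ingest_filings(bucket: str, key: str) -> pd.DataFrame:
    """Read one JSON Lines object and normalize it into a flat frame."""
    obj = s3.get_object(Bucket=bucket, Key=key)
    df = pd.read_json(io.BytesIO(obj["Body"].read()), lines=True)
    # Stable snake_case column names, typed timestamps, deduplication.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    df["filed_at"] = pd.to_datetime(df["filed_at"], utc=True, errors="coerce")
    return df.drop_duplicates(subset=["accession_number"])

def land_parquet(df: pd.DataFrame, bucket: str, prefix: str) -> None:
    """Write the curated frame to S3 as Parquet partitioned by filing date."""
    for day, part in df.groupby(df["filed_at"].dt.date):
        buf = io.BytesIO()
        part.to_parquet(buf, index=False)  # requires pyarrow (or fastparquet)
        s3.put_object(
            Bucket=bucket,
            Key=f"{prefix}/filed_date={day}/part-00000.parquet",
            Body=buf.getvalue(),
        )
```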

Cloud Data Platforms

  • Develop and manage data solutions leveraging AWS services (e.g., S3, Glue, Lambda, Redshift, and Amazon MWAA for Airflow DAG orchestration).
  • Implement and optimize Apache Iceberg table formats for large-scale, ACID-compliant data lakes.
  • Support lakehouse architectures that unify data lakes and data warehouses.
  • Optimize data storage and retrieval strategies for performance and cost efficiency.
  • Enable data platforms that support AI/ML workloads and downstream generative AI use cases.
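
A minimal sketch of the Iceberg work described above, assuming a SparkSession whose "lake" catalog is already configured as an Iceberg catalog; the table, schema, and staging view names are illustrative.

```python
from pyspark.sql import SparkSession

# Assumes a SparkSession whose "lake" catalog is already configured as an
# Iceberg catalog (e.g., org.apache.iceberg.spark.SparkCatalog backed by Glue).
spark = SparkSession.builder.appName("iceberg-sketch").getOrCreate()

# Hidden partitioning: Iceberg tracks the days(filed_at) transform itself,
# so queries prune partitions without a separate partition column.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.finance.filings (
        accession_number STRING,
        cik              BIGINT,
        form_type        STRING,
        filed_at         TIMESTAMP
    )
    USING iceberg
    PARTITIONED BY (days(filed_at))
""")

# Idempotent upsert: ACID semantics make this MERGE safe to re-run.
# "staging_filings" is an assumed temp view of newly ingested rows.
spark.sql("""
    MERGE INTO lake.finance.filings t
    USING staging_filings s
    ON t.accession_number = s.accession_number
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```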

CI/CD and DataOps

  • Design and implement CI/CD pipelines for data workflows, infrastructure, and analytics code using tools such as GitHub Actions, GitLab CI, Jenkins, or AWS-native services.
  • Automate build, test, and deployment processes for ETL pipelines and data platform components.
  • Implement DataOps best practices, including version control, automated testing, environment promotion, and rollback strategies.
  • Ensure reproducibility, reliability, and governance of data pipeline deployments across environments.
  • Integrate AI-driven testing and monitoring tools to improve pipeline quality and reduce operational risk.
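
The automated-testing and environment-promotion bullets above imply quality gates in CI. Below is a small pytest sketch such a gate might run before promoting a pipeline; the fixture path and column names are assumptions.

```python
# Sketch of a data-quality gate a CI job (GitHub Actions, GitLab CI, Jenkins,
# etc.) could run before promoting a pipeline between environments. The
# fixture path and column names are illustrative assumptions.
import pandas as pd
import pytest

@pytest.fixture
def curated_filings() -> pd.DataFrame:
    # In CI this would load a small committed fixture or a staging sample.
    return pd.read_parquet("tests/fixtures/filings_sample.parquet")

def test_primary_key_is_unique(curated_filings):
    assert not curated_filings["accession_number"].duplicated().any()

def test_no_null_timestamps(curated_filings):
    assert curated_filings["filed_at"].notna().all()

def test_form_types_are_known(curated_filings):
    allowed = {"10-K", "10-Q", "8-K"}
    assert set(curated_filings["form_type"].unique()) <= allowed
```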

Performance Optimization

  • Design and implement materialized views and other performance optimization techniques to improve query efficiency.
  • Tune data pipelines and queries for performance, scalability, and cost.
  • Implement partitioning, indexing, and caching strategies aligned to workload patterns.
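
For illustration, the sketch below creates an auto-refreshing Redshift materialized view through the Redshift Data API; the workgroup, database, schema, and table names are hypothetical.

```python
# Sketch: create an auto-refreshing materialized view in Redshift via the
# Data API. The workgroup, database, schema, and table names are hypothetical.
import boto3

rsd = boto3.client("redshift-data")

CREATE_MV = """
CREATE MATERIALIZED VIEW analytics.mv_daily_filing_counts
AUTO REFRESH YES
AS
SELECT form_type,
       DATE_TRUNC('day', filed_at) AS filed_day,
       COUNT(*)                    AS filing_count
FROM curated.filings
GROUP BY 1, 2;
"""

rsd.execute_statement(
    WorkgroupName="analytics-wg",  # hypothetical Redshift Serverless workgroup
    Database="warehouse",
    Sql=CREATE_MV,
)
```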

XBRL and Regulatory Data

  • Develop pipelines to ingest, parse, and normalize XBRL (eXtensible Business Reporting Language) data.
  • Support regulatory and financial data use cases requiring high accuracy and traceability.
  • Ensure alignment with data standards and validation rules for financial reporting datasets.
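
Since XBRL instance documents are namespaced XML, a first-pass fact extractor can be written directly against lxml, as sketched below. The us-gaap namespace varies by taxonomy release year, so the value shown is a placeholder.

```python
# Sketch: first-pass fact extraction from an XBRL instance document. XBRL is
# namespaced XML; each fact carries a contextRef (entity/period) and, for
# numeric facts, a unitRef and decimals attribute.
from lxml import etree

US_GAAP = "http://fasb.org/us-gaap/2024"  # placeholder; varies by taxonomy year

def extract_facts(path: str, namespace: str = US_GAAP) -> list[dict]:
    tree = etree.parse(path)
    facts = []
    for el in tree.iter():
        if not isinstance(el.tag, str):
            continue  # skip comments and processing instructions
        qname = etree.QName(el)
        if qname.namespace != namespace or el.text is None:
            continue
        facts.append({
            "concept": qname.localname,
            "value": el.text.strip(),
            "context": el.get("contextRef"),
            "unit": el.get("unitRef"),
            "decimals": el.get("decimals"),
        })
    return facts
```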

Context Engineering and Metadata

  • Apply context engineering principles to ensure data is enriched with meaningful metadata, lineage, and business context.
  • Collaborate with Data Architects to support data modeling, schema design, and entity relationships.
  • Enable downstream analytics and AI use cases by structuring data for usability, discoverability, and governance.
  • Integrate pipelines with enterprise data catalogs and metadata management systems.
  • Support automated metadata capture, lineage tracking, and data quality monitoring.
  • Ensure alignment with data governance frameworks and standards established by OCDO organizations, including AI data readiness and traceability.
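
One way to picture the metadata-capture bullets above: a small sketch that reads table metadata from the AWS Glue Data Catalog, from which lineage and business context could be pushed to an enterprise catalog. The database and table names passed in are assumptions.

```python
# Sketch: capture table metadata from the AWS Glue Data Catalog so lineage
# and business context can be pushed into an enterprise catalog.
import boto3

glue = boto3.client("glue")

def describe_table(database: str, name: str) -> dict:
    """Return a compact metadata record for one catalog table."""
    table = glue.get_table(DatabaseName=database, Name=name)["Table"]
    sd = table["StorageDescriptor"]
    return {
        "table": f"{database}.{name}",
        "location": sd.get("Location"),
        "columns": [
            {"name": c["Name"], "type": c["Type"], "comment": c.get("Comment")}
            for c in sd["Columns"]
        ],
        "parameters": table.get("Parameters", {}),  # e.g., owner, classification
    }
```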

Collaboration and Agile Delivery

  • Collaborate with data architects, analysts, and business stakeholders to understand data needs and deliver solutions.
  • Participate in stakeholder listening campaigns, workshops, and data discovery efforts.
  • Work in Agile teams to iteratively deliver data capabilities and enhancements.
  • Contribute to identifying and implementing AI-driven efficiencies and automation opportunities across the data lifecycle.