Senior Data Engineer (Python/Scala)

ICF | Reston, VA
Posted 2 days ago | $98,614 - $167,644 | Remote

About The Position

ICF is looking for an enthusiastic Data Engineer to join our team and help with data management and data analysis. If you are a Data Engineer interested in applying your expertise in a consulting environment, then this may be the role for you.

Job Location: This position requires that the work be performed in the United States. If you accept this position, please note that ICF monitors employee work locations, blocks access from foreign locations/foreign IP addresses, and prohibits personal VPN connections. You may be asked to travel to an office once a quarter.

Our core work hours are 10:00 a.m. to 4:00 p.m. Eastern Time, with the option to start earlier or work later depending on your time zone. Note, however, that our client is on the East Coast and may sometimes start a meeting before 10:00 a.m., which may require your participation.

Requirements

  • Bachelor’s Degree
  • 1+ years of experience working with tools like JIRA, GitHub, and Confluence.
  • 2+ years of experience working on cloud platforms in AWS.
  • 2+ years of experience with relational database and data warehousing concepts.
  • 1+ years of experience with Python, Scala, and Spark technologies.
  • 1+ years of experience with data orchestration tools like NiFi, Airflow, Step Functions, etc.
  • 1+ years of experience with serverless or cloud-native analytics platforms
  • Candidate must be able to obtain and maintain a Federal Public Trust clearance.
  • Candidate must reside in the U.S., be authorized to work in the U.S., and all work must be performed in the U.S.
  • Candidate must have lived in the U.S. for three (3) full years out of the last five (5) years

Nice To Haves

  • Familiarity with data profiling, data catalogs, lineage tools, or observability platforms.
  • Prior experience or knowledge contributing to or leading federated data governance efforts.
  • 5 years of applying excellent problem-solving skills and end-to-end quantitative thinking.
  • Ability to self-organize, prioritize, and conduct work on multiple projects under tight deadlines in a fast-paced environment.
  • Prior experience in consulting or healthcare is an advantage but not essential.
  • Good leadership and team-working skills.
  • Highly effective analytical, problem-solving, and decision-making capabilities.
  • Excellent communication and interpersonal skills to interface effectively at all levels of the business.
  • Organized, detail-oriented, and able to prioritize and multitask.

Responsibilities

  • Create dashboards in AWS QuickSight with both visuals (charts, graphs, etc.) and tables that let end users slice and dice data to gain insight into various business processes.
  • Design and maintain scalable Spark-based data ingestion pipelines with adaptive change management to accommodate evolving business needs and technical requirements (a minimal PySpark ingestion sketch follows this list).
  • Lead centralized orchestration for both batch and event-driven workflows, ensuring seamless and efficient data movement throughout the platform (see the Airflow orchestration sketch after this list).
  • Develop reusable templates and self-service solutions to enable efficient updates and enhancements to data models, empowering teams to manage changes independently.
  • Optimize distributed compute resources to enhance performance, reliability, and cost-effectiveness of data processing environments.
  • Define and enforce data contracts, manage schema versioning, and automate metadata processes to uphold reliable data standards and strong governance (a schema-contract sketch follows this list).
  • Collaborate in a federated model to operationalize essential compliance requirements, including handling personally identifiable information (PII), data retention, and maintaining consistent naming conventions across datasets.
  • Enforce robust data quality checks, including schema validation, handling of nulls, uniqueness, volume, freshness, and distribution metrics, as well as referential integrity across all datasets (a minimal data-quality check sketch follows this list).
  • Embed orchestration of data quality checks at various checkpoints within the pipeline to ensure ongoing compliance and reliability.
  • Log, audit, and measure all quality results to provide transparency, accountability, and continuous improvement in data quality management.
  • Work with architects as a technical leader, contributing to the establishment of engineering standards, best practices, and guiding critical design decisions.
  • Partner with business and domain owners to understand domain data structure and translate requirements into reliable and scalable data products.
  • Lead incident triage, conduct root cause analysis, and drive continuous improvements in platform reliability and data quality.
  • Define and track key performance indicators (KPIs) for data quality, freshness, stability, adoption, and cost.
  • Demo work in both small and large virtual settings with clients and end users to obtain feedback on enhancing dashboards to meet business requirements.
  • Work within a SAFe scaled agile framework, collaborating with other team members to ensure solutions meet client needs with the highest quality.
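
To make the Spark ingestion responsibility concrete, here is a minimal PySpark sketch of the kind of pipeline described above. It is illustrative only, not ICF's actual pipeline: the bucket paths and the `orders` dataset and its columns are hypothetical assumptions.

```python
# Minimal PySpark ingestion sketch. The bucket paths and the `orders`
# columns are hypothetical, not ICF's actual pipeline.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

# Read raw CSV landed by an upstream process (path is illustrative).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-raw-bucket/orders/")
)

# Light standardization: typed columns plus an ingestion timestamp.
clean = (
    raw.withColumn("order_date", F.to_date("order_date"))
       .withColumn("amount", F.col("amount").cast("decimal(12,2)"))
       .withColumn("ingested_at", F.current_timestamp())
)

# Partitioned Parquet write so downstream jobs can prune by date.
(clean.write
      .mode("append")
      .partitionBy("order_date")
      .parquet("s3://example-curated-bucket/orders/"))
```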
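
For the centralized batch orchestration responsibility, the sketch below shows a minimal Airflow DAG (assuming Airflow 2.4+, which uses the `schedule` argument). The DAG id, task callables, and daily schedule are hypothetical placeholders.

```python
# Minimal Airflow DAG sketch (assumes Airflow 2.4+). DAG id, schedule,
# and task bodies are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("extract: pull source data")  # placeholder


def transform():
    print("transform: run the Spark job")  # placeholder


def load():
    print("load: publish curated tables")  # placeholder


with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain: extract -> transform -> load.
    t_extract >> t_transform >> t_load
```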
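
For the data contracts and schema versioning responsibility, the sketch below shows one way to pin a versioned schema contract and reject drifting inputs in PySpark. The contract name, its fields, and the failure behavior are assumptions for illustration only.

```python
# Minimal schema-contract sketch in PySpark. Contract name, fields, and
# failure behavior are illustrative assumptions.
from pyspark.sql import DataFrame
from pyspark.sql.types import (
    DateType, DecimalType, StringType, StructField, StructType,
)

# A versioned contract: bump the suffix when the agreed schema changes.
ORDERS_CONTRACT_V2 = StructType([
    StructField("order_id", StringType(), nullable=False),
    StructField("order_date", DateType(), nullable=True),
    StructField("amount", DecimalType(12, 2), nullable=True),
])


def enforce_contract(df: DataFrame, contract: StructType) -> DataFrame:
    """Reject a DataFrame whose column names/types drift from the contract.

    Nullability is intentionally not compared here; only names and types.
    """
    actual = {(f.name, f.dataType) for f in df.schema.fields}
    expected = {(f.name, f.dataType) for f in contract.fields}
    missing = expected - actual
    if missing:
        raise ValueError(f"Schema contract violation: {sorted(map(str, missing))}")
    # Project to exactly the contracted columns, in contract order.
    return df.select([f.name for f in contract.fields])
```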
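
Finally, for the data quality responsibilities, this minimal PySpark sketch runs null, uniqueness, volume, and freshness checks, logs each result, and fails the run on any violation so orchestration can alert and retry. The table path, key column, and thresholds are assumed for illustration.

```python
# Minimal PySpark data-quality sketch: null, uniqueness, volume, and
# freshness checks. Path, key column, and thresholds are assumptions.
from datetime import datetime, timedelta

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.read.parquet("s3://example-curated-bucket/orders/")

total = df.count()
latest = df.agg(F.max("ingested_at")).first()[0]

checks = {
    # The key column must never be null.
    "no_null_ids": df.filter(F.col("order_id").isNull()).count() == 0,
    # The key column must be unique across the dataset.
    "unique_ids": df.select("order_id").distinct().count() == total,
    # Volume guard: expect at least this many rows per load (assumed).
    "min_volume": total >= 1000,
    # Freshness: newest ingestion timestamp within the last day (assumed SLA).
    "fresh": latest is not None and latest >= datetime.now() - timedelta(days=1),
}

# Log every result, then fail the run so orchestration can alert/retry.
for name, passed in checks.items():
    print(f"check={name} passed={passed}")
failed = [name for name, passed in checks.items() if not passed]
if failed:
    raise ValueError(f"Data quality checks failed: {failed}")
```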