Data Engineer

Oasys
Tysons, VA
Hybrid

About The Position

The Data Engineer will be pivotal in designing, developing, and maintaining robust ETL (Extract, Transform, Load) processes to ensure seamless data flow between our diverse data sources and target data stores. You will be responsible for building and optimizing automated pipelines, ensuring data quality, and accommodating future data format changes. This position requires a strong technical foundation and a proactive approach to problem-solving.

Requirements

  • ETL development with AWS Glue, Python, PySpark, and Amazon RDS.
  • Solid understanding of data governance principles and data quality best practices.
  • Ability to work independently and as part of a collaborative team in an Agile environment.
  • Excellent problem-solving, analytical, and communication skills.
  • Bachelor’s degree in Computer Science, Information Systems, or a related field.
  • 10+ years of experience in data integration, ETL development, and data warehousing.
  • Strong proficiency in SQL and experience with relational databases (e.g., Oracle, PostgreSQL) and NoSQL databases.
  • Experienced with scripting languages such as Python or shell for automation and data manipulation.
  • Experienced with cloud technologies, including AWS Glue, Lambda, CloudFormation/Ansible, S3, Redshift, and EMR.
  • Experienced with Git, GitHub, CI/CD pipelines for DevOps and data engineering.
  • AWS certification (minimum: Cloud Practitioner, AI Practitioner)
  • Must be a U.S. citizen (no dual citizenship will be accepted)
  • Ability to obtain a favorable Public Trust investigation

Responsibilities

  • Pipeline Design & Development: Design, develop, and implement scalable and efficient ETL pipelines using modern data integration tools and technologies.
  • Data Transformation: Transform and cleanse data from various sources (databases, APIs, cloud storage, etc.) to ensure accuracy, consistency, and compliance with data governance policies.
  • Data Store Management: Develop and maintain optimized data models and data warehousing solutions utilizing platforms like Oracle, PostgreSQL, Redshift, and EMR. Focus on performance tuning and query optimization.
  • Automation & Monitoring: Build and maintain automated ETL jobs, incorporating robust monitoring and alerting mechanisms for proactive issue detection and resolution.
  • Data Quality Assurance: Implement data quality checks and validation rules throughout the ETL process to guarantee data integrity.
  • Documentation: Create and maintain comprehensive documentation for ETL processes, data models, and system configurations.
  • Collaboration: Work closely with business stakeholders and other teams to understand data requirements and deliver effective solutions.
  • Future-Proofing: Proactively assess and implement changes to data integration processes so that designs accommodate evolving data formats, sources, and business needs.
  • All other duties as assigned by leadership.