Hadoop Data Engineer

Qode (Pennsylvania, PA)

About The Position

The Hadoop Data Engineer is responsible for designing, developing, and maintaining large-scale data processing systems within a distributed Hadoop ecosystem. The role focuses on enabling data-driven decision-making across banking operations, risk management, compliance, and customer analytics.

Requirements

  • Strong experience with Hadoop ecosystem (HDFS, MapReduce, Hive, HBase)
  • Strong experience with Apache Spark (Scala/Python)
  • Strong experience with SQL & NoSQL databases
  • Strong experience with ETL tools (Informatica, Talend, or similar)
  • Strong experience with Kafka or other streaming tools
  • Proficiency in Python / Java / Scala
  • Experience with data warehousing concepts
  • Experience with workflow orchestration tools (Airflow, Oozie)
  • Experience with Unix/Linux environments
  • Understanding of banking and financial services data
  • Strong problem-solving and analytical abilities
  • Excellent communication and collaboration skills
  • Ability to work in Agile/Scrum environments
  • Bachelor’s or Master’s degree in Computer Science, Information Technology, or related field

Nice To Haves

  • Knowledge of cloud data platforms (AWS EMR, Azure Data Lake)
  • Exposure to risk, compliance, or fraud analytics

Responsibilities

  • Design, develop, and maintain scalable data pipelines using Hadoop ecosystem tools (HDFS, Hive, Spark, Sqoop, Kafka).
  • Build and optimize ETL/ELT processes to support data ingestion from multiple banking systems.
  • Develop and manage big data solutions for structured and unstructured data.
  • Collaborate with data analysts, data scientists, and business stakeholders to deliver data solutions.
  • Ensure data quality, integrity, and governance aligned with banking and regulatory standards.
  • Tune and optimize the performance of Hadoop/Spark jobs.
  • Implement data security controls to comply with financial regulations (e.g., PCI, SOX).
  • Support real-time and batch data processing frameworks.
  • Troubleshoot production issues and provide continuous support for data platforms.
  • Work with cloud platforms (e.g., AWS, Azure) for modern data solutions.