Hadoop Developer

Bright Vision Technologies, Edison, NJ
Remote

About The Position

Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications. As we continue to grow, we’re looking for a skilled Hadoop Developer to join our dynamic team and contribute to our mission of transforming business processes through technology. This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related technical discipline.
  • Five or more years of professional experience designing and operating big-data pipelines on Hadoop.
  • Strong hands-on expertise with Apache Spark (Scala, Python, or Java) in production environments.
  • Solid experience with Hive, HDFS, Sqoop, HBase, and the broader Hadoop ecosystem.
  • Hands-on experience with streaming data platforms such as Kafka, Spark Streaming, or Flink.
  • Strong SQL skills and experience working with both relational and NoSQL data stores.
  • Experience with workflow orchestration tools such as Airflow or Oozie.
  • Solid understanding of distributed systems concepts, including partitioning, replication, and fault tolerance.
  • Strong scripting skills in Python or Shell.
  • Excellent troubleshooting, debugging, and documentation skills.

Nice To Haves

  • Experience operating Hadoop on cloud platforms such as AWS EMR, Azure HDInsight, or Databricks.
  • Familiarity with modern lakehouse formats (Delta, Iceberg, Hudi).
  • Exposure to data governance tooling such as Apache Atlas or Collibra.
  • Experience with Kubernetes-based data platforms (Spark-on-K8s, Trino).
  • Hands-on experience with CI/CD and infrastructure-as-code in data engineering workflows.

Responsibilities

  • Design, develop, and operate end-to-end big-data pipelines on Hadoop, ingesting data from a diverse mix of relational, file-based, streaming, and API-driven sources.
  • Build robust ETL/ELT workflows using Apache Spark, Hive, Pig, and Sqoop, with strong attention to data quality, idempotency, error handling, and recoverability.
  • Develop high-throughput streaming data pipelines using Kafka, Spark Streaming, or Flink, and integrate them with downstream analytical and operational systems.
  • Optimize Spark and MapReduce jobs through careful tuning of partitioning, memory, serialization, and skew handling to meet demanding SLAs at minimal cost.
  • Design and maintain data models and storage layouts on HDFS, Hive, HBase, and modern lakehouse formats (Parquet, ORC, Delta, Iceberg, Hudi) to balance flexibility and performance.
  • Implement data governance, lineage, and quality controls in collaboration with data governance and security teams.
  • Build robust monitoring, alerting, and logging strategies for big-data pipelines, including job-level SLAs and proactive failure detection.
  • Partner with data scientists and analysts to deliver curated, reliable, and well-documented datasets that accelerate their work.
  • Automate pipeline orchestration using Airflow, Oozie, or similar workflow engines, with clean dependency management and clear ownership boundaries.
  • Continuously evaluate and adopt new technologies in the big-data and cloud ecosystem (EMR, Databricks, Snowflake, BigQuery) where they offer meaningful improvements.
  • Lead performance reviews and architecture audits of existing pipelines, proposing concrete refactoring and optimization initiatives.
  • Document data architectures, schemas, pipeline behaviors, and operational runbooks in a way that makes the platform supportable as the team scales.
  • Mentor junior engineers and contribute to the team’s engineering standards and best practices.

Benefits

  • Competitive base salary commensurate with experience, plus benefits.