AWS Python Developer with PySpark

Polar IT Services · Newark, NJ
Hybrid

About The Position

We are seeking an experienced Python Developer with strong expertise in AWS and PySpark to join our data engineering team. The ideal candidate will have hands-on experience developing scalable data pipelines, processing large datasets, and integrating with cloud environments. This role requires excellent problem-solving skills and a strong understanding of distributed data processing frameworks.

Requirements

  • 10+ years of experience in software development with a strong focus on Python.
  • Hands-on experience with PySpark for distributed data processing.
  • Solid understanding of AWS cloud services such as S3, Glue, Lambda, EMR, Redshift, and Athena.
  • Strong experience in ETL development and data pipeline orchestration.
  • Familiarity with SQL and relational/non-relational databases.
  • Excellent analytical, debugging, and communication skills.
  • Bachelor’s degree in Computer Science, Data Engineering, or a related field.

Nice to Have

  • Experience with Airflow, Databricks, or other workflow management tools.
  • Knowledge of CI/CD pipelines and version control tools like Git.
  • Exposure to data lake or data warehouse architectures.
  • Familiarity with Docker or Kubernetes for deployment.

Responsibilities

  • Design, develop, and maintain data pipelines and ETL workflows using Python, PySpark, and AWS services.
  • Build and optimize large-scale data processing and data transformation solutions.
  • Integrate various data sources and ensure data quality, performance, and reliability.
  • Collaborate with data engineers, analysts, and architects to deliver end-to-end data solutions.
  • Implement best practices for code optimization, error handling, and data validation.
  • Participate in code reviews, documentation, and deployment automation.
  • Ensure adherence to data security and compliance standards.