Senior Spark Developer (Python, AWS, SQL)

Luxoft · New York, NY
$120,000 - $150,000

About The Position

We are seeking a highly skilled Spark Developer with strong experience in Python, AWS, and SQL to join our team. The ideal candidate will be responsible for designing, developing, and optimizing large-scale data processing solutions while ensuring data quality, scalability, and performance. This role requires a solid background in distributed computing, cloud environments, and data engineering best practices. Compensation for NYC: $120,000 to $150,000 USD gross per year, determined based on interview results.

Requirements

  • 8+ years of experience in data engineering or backend development.
  • Hands-on experience with Apache Spark (PySpark) in large-scale data environments.
  • Strong proficiency in Python programming.
  • Expertise in SQL (including advanced queries, performance tuning, and optimization).
  • Experience working with AWS services such as S3, Glue, EMR, Lambda, Redshift, or Kinesis.
  • Understanding of data warehousing concepts and ETL best practices.
  • Strong problem-solving skills and ability to work in an agile, collaborative environment.

Nice To Have

  • Experience with Databricks or similar Spark-based platforms.
  • Knowledge of streaming frameworks (Kafka, Flink).
  • Familiarity with CI/CD pipelines, Docker, Kubernetes, Terraform.
  • Exposure to data modeling (star schema, snowflake schema, data vault).
  • Experience in financial services / capital markets.

Responsibilities

  • Design, develop, and maintain scalable data pipelines using Apache Spark (batch and/or streaming).
  • Build, optimize, and manage ETL/ELT workflows integrating multiple data sources.
  • Develop data solutions in Python for data transformations, automation, and orchestration.
  • Leverage AWS services (S3, EMR, Glue, Lambda, Redshift, Kinesis, etc.) to implement cloud-native data platforms.
  • Write efficient SQL queries for data extraction, transformation, and reporting.
  • Ensure data quality, lineage, and governance across pipelines.
  • Collaborate with data engineers, architects, and analysts to deliver end-to-end data solutions.
  • Troubleshoot performance bottlenecks and optimize Spark jobs for speed and cost-efficiency.