Synechron Inc • Posted 25 days ago
Full-time • Mid Level
Pittsburgh, PA
5,001-10,000 employees
Professional, Scientific, and Technical Services

We are seeking a talented and detail-oriented Data Engineer to design, develop, and maintain scalable data pipelines and big data solutions. The ideal candidate will have strong expertise in Python, PySpark, Apache Spark, and big data technologies. The candidate will work closely with data scientists, analysts, and stakeholders to ensure data is collected, processed, and made available for analytics and decision-making.

Responsibilities:
  • Design, develop, and optimize large-scale data pipelines using Spark and Python (PySpark).
  • Build and maintain scalable data architecture and infrastructure in big data environments.
  • Collaborate with data scientists and analysts to understand data requirements and deliver reliable data solutions.
  • Extract, transform, and load (ETL) data from various sources into data lakes or data warehouses.
  • Implement data quality checks, validation, and monitoring processes to ensure data integrity.
  • Optimize Spark jobs for performance and resource utilization.
  • Automate data workflows and pipelines using scheduling tools like Apache Airflow, Luigi, or similar.
  • Work with cloud-based big data services (e.g., AWS EMR, Azure HDInsight, Google Cloud Dataproc) as applicable.
  • Document data pipelines, architecture, and processes for future maintenance and compliance.
  • Stay updated with emerging big data technologies and industry best practices.

Requirements:
  • Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
  • Experience in data engineering, big data, or similar roles.
  • Strong programming skills in Python, especially with PySpark.
  • Hands-on experience with Apache Spark (preferably version 2.x or 3.x).
  • Experience working with large-scale data processing frameworks and data lakes.
  • Familiarity with SQL and NoSQL databases.
  • Knowledge of distributed computing concepts and data storage solutions.
  • Experience with cloud platforms such as AWS, Azure, or Google Cloud is a plus.
  • Knowledge of data modeling, data warehousing, and ETL best practices.
  • Proficiency with data orchestration tools like Apache Airflow or Luigi.
  • Strong problem-solving skills and attention to detail.
  • Excellent communication and collaboration skills.
  • Experience with Kafka, Hadoop, Hive, or other big data tools.
  • Familiarity with containerization (Docker) and orchestration (Kubernetes).
  • Understanding of data security, privacy, and compliance standards.
  • Knowledge of streaming data processing and real-time analytics.

We offer:
  • A highly competitive compensation and benefits package.
  • A multinational organization with 58 offices in 21 countries and the opportunity to work abroad.
  • 10 days of paid annual leave (plus sick leave and national holidays).
  • Maternity & paternity leave plans.
  • A comprehensive insurance plan including medical, dental, vision, life insurance, and long-/short-term disability (plans vary by region).
  • Retirement savings plans.
  • A higher education certification policy.
  • Commuter benefits (varies by region).
  • Extensive training opportunities, focused on skills, substantive knowledge, and personal development.
  • On-demand Udemy for Business for all Synechron employees, with free access to more than 5,000 curated courses.
  • Coaching opportunities with experienced colleagues from our Financial Innovation Labs (FinLabs) and Centers of Excellence (CoE) groups.
  • Cutting-edge projects at the world's leading tier-one banks, financial institutions, and insurance firms.
  • A flat and approachable organization.
  • A truly diverse, fun-loving, and global work culture.