Data Engineer PySpark & BigQuery

TATA Consulting Services · Phoenix, AZ
Posted 40 days ago · $95,000 - $115,000

About The Position

Job Title: Data Engineer PySpark & BigQuery
Experience Required: 8+ years

TATA Consulting Services is looking for a Data Engineer to design, build, and maintain scalable ETL/ELT pipelines with PySpark, Airflow/Cloud Composer, and BigQuery on Google Cloud Platform, with an emphasis on data reliability, governance, and performance and cost optimization. The must-have technical/functional skills and the role's responsibilities are detailed in the Requirements and Responsibilities sections below.
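
As a rough sketch of the BigQuery build-and-optimize work this role involves, the example below creates a date-partitioned, clustered reporting table with the google-cloud-bigquery Python client; partition pruning and clustering are the usual levers for reducing bytes scanned and query cost. The project, dataset, table, and column names are hypothetical placeholders.

```python
# Hedged sketch: project, dataset, table, and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# A date-partitioned, clustered table lets BigQuery prune partitions and
# co-locate rows by region, cutting both scan cost and query latency.
ddl = """
CREATE TABLE IF NOT EXISTS `my-project.analytics.daily_orders_by_region`
(
  order_date      DATE,
  customer_region STRING,
  order_count     INT64,
  total_amount    NUMERIC
)
PARTITION BY order_date
CLUSTER BY customer_region
"""

client.query(ddl).result()  # blocks until the DDL job finishes
```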

Requirements

  • Design, develop, and maintain scalable ETL/ELT pipelines using PySpark, Airflow, and GCP-native tools.
  • Build and optimize data warehouses and analytics solutions in BigQuery.
  • Implement and manage workflow orchestration with Airflow/Cloud Composer.
  • Write complex SQL queries for data transformations, analytics, and performance optimization.
  • Ensure data reliability, security, and governance across pipelines.
  • Conduct performance tuning and cost optimization of BigQuery and PySpark workloads.
  • Collaborate with analysts and product teams to deliver reliable data solutions.
  • Troubleshoot, debug, and resolve production issues in large-scale data pipelines.
  • Contribute to best practices, reusable frameworks, and automation for data engineering.
  • 5+ years of experience in Data Engineering / Data Warehousing using Big Data technologies is an added advantage.
  • Expertise in the distributed computing ecosystem.
  • Hands-on programming experience with Python.
  • Expert understanding of Hadoop and Spark architecture and their working principles.
  • Hands-on experience writing and understanding complex SQL (Hive/PySpark DataFrames), including optimizing joins while processing large volumes of data (a minimal PySpark sketch follows this list).
  • Experience in UNIX shell scripting.
  • Ability to design and develop optimized data pipelines for batch and real-time data processing.
  • Experience in the analysis, design, development, testing, and implementation of system applications.
  • Demonstrated ability to develop and document technical and functional specifications and analyze software and system processing flows.
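
A minimal PySpark sketch of the pipeline and join-optimization work listed above, assuming the spark-bigquery connector is available on the cluster; the GCS paths, table name, and staging bucket are hypothetical placeholders.

```python
# Hedged sketch: GCS paths, table name, and staging bucket are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-enrichment").getOrCreate()

# Large fact table and a small dimension table from a hypothetical GCS landing zone.
orders = spark.read.parquet("gs://example-landing-zone/orders/")
customers = spark.read.parquet("gs://example-landing-zone/customers/")

# Broadcasting the small dimension table avoids shuffling the large fact table.
enriched = (
    orders
    .join(F.broadcast(customers), on="customer_id", how="left")
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date", "customer_region")
    .agg(
        F.count("*").alias("order_count"),
        F.sum("order_amount").alias("total_amount"),
    )
)

# Write to BigQuery through the spark-bigquery connector (assumed installed);
# the indirect write method stages data in a temporary GCS bucket.
(
    enriched.write.format("bigquery")
    .option("table", "my-project.analytics.daily_orders_by_region")
    .option("temporaryGcsBucket", "example-tmp-staging-bucket")
    .mode("overwrite")
    .save()
)
```

Broadcasting the small side of the join keeps the large fact table from being shuffled across the cluster, which is typically the first lever when tuning join-heavy PySpark workloads.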

Responsibilities

  • Develop and maintain data pipelines using Big Data technologies (see the orchestration sketch after this list).
  • Focus on ingesting, storing, processing, and analyzing large datasets.
  • Design, develop, and maintain applications.
  • Communicate complex data structures and their associated components.
  • Design, code, test, maintain, and document application components.
  • Lead reviews of colleagues' work.
  • Define test conditions based on the functional and non-functional requirements provided.
  • Maintain a deep understanding of the core tools used in planning, analyzing, crafting, building, testing, configuring, and maintaining the assigned application(s).
  • Maintain a deep understanding of infrastructure, technologies, and components.
  • Assess and interview team members to identify and develop talent.
  • Make impactful changes by influencing leadership and making timely decisions.
  • Monitor system performance and availability, and improve software quality through root cause analysis.
  • Demonstrate a proven track record of influencing technological growth across teams.
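
A minimal Airflow/Cloud Composer sketch of the orchestration responsibilities above, assuming the apache-airflow-providers-google package is installed (it ships with Cloud Composer); the DAG id, schedule, Dataproc cluster, GCS path, and table names are hypothetical placeholders.

```python
# Hedged sketch: DAG id, schedule, cluster, paths, and table names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator

with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Run the (hypothetical) PySpark enrichment job on an existing Dataproc cluster.
    enrich_orders = DataprocSubmitJobOperator(
        task_id="enrich_orders",
        project_id="my-project",
        region="us-central1",
        job={
            "placement": {"cluster_name": "etl-cluster"},
            "pyspark_job": {
                "main_python_file_uri": "gs://example-code-bucket/jobs/enrich_orders.py"
            },
        },
    )

    # Refresh a downstream reporting table in BigQuery with standard SQL.
    build_report = BigQueryInsertJobOperator(
        task_id="build_report",
        configuration={
            "query": {
                "query": """
                    SELECT order_date, customer_region, SUM(total_amount) AS revenue
                    FROM `my-project.analytics.daily_orders_by_region`
                    GROUP BY order_date, customer_region
                """,
                "useLegacySql": False,
            }
        },
    )

    enrich_orders >> build_report
```

On Cloud Composer a DAG like this is deployed by copying the file into the environment's dags/ folder in GCS; retries, alerting, and SLAs would normally be layered on via default_args.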

Benefits

  • Discretionary Annual Incentive.
  • Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans.
  • Family Support: Maternal & Parental Leaves.
  • Insurance Options: Auto & Home Insurance, Identity Theft Protection.
  • Convenience & Professional Growth: Commuter Benefits & Certification & Training Reimbursement.
  • Time Off: Vacation, Sick Leave & Holidays.
  • Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing.


What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Industry

Professional, Scientific, and Technical Services

Education Level

No Education Listed

Number of Employees

5,001-10,000 employees
