About The Position

Are you passionate about building data infrastructure that powers real-time logistics decisions for millions of customers? The Global Transportation Services (GTS) Speed team is looking for a Data Engineer II to own critical data pipeline components that drive delivery speed optimization across Amazon's transportation network. In this role, you will independently design, build, and operate scalable data infrastructure solutions that integrate with multiple heterogeneous data sources. You will own end-to-end pipeline development — from extraction and transformation to loading and serving — ensuring data is delivered reliably and efficiently for reporting, analysis, and machine learning workloads. You will manage multiple Redshift clusters supporting the transportation organization's reporting needs, make technical decisions on data modeling and architecture for your domain, and collaborate with cross-functional teams to translate business requirements into high-impact data solutions.

Requirements

  • 3+ years of data engineering experience
  • Experience with distributed systems as they pertain to data storage and computing
  • Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field
  • 2+ years of experience writing production data pipelines using SQL and Python
  • Experience designing and implementing ETL/ELT solutions with large-scale data processing

Nice To Haves

  • Experience with AWS technologies such as Redshift, S3, AWS Glue, EMR, Kinesis, Kinesis Data Firehose, Lambda, SageMaker, EC2, and IAM roles and permissions
  • Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
  • Experience in mentoring, leading, or managing more junior engineers
  • Master's degree in Engineering, Computer Science, or a related field
  • Experience with data orchestration frameworks such as Apache Airflow, AWS Step Functions, or Glue Workflows
  • Experience with infrastructure-as-code tools (CloudFormation, Terraform, or CDK)

Responsibilities

  • Own end-to-end design, development, and operation of ETL/ELT pipelines that extract, transform, and load data from diverse sources using SQL, Python, and AWS big data technologies
  • Manage and optimize multiple production Redshift clusters, including performance tuning, capacity planning, and cost optimization to support transportation org reporting needs
  • Lead technical design discussions with Product teams, Data Scientists, Software Developers, and Business Intelligence Engineers to define data infrastructure requirements and deliver scalable solutions
  • Define and enforce data engineering best practices for your domain, including code quality standards, testing frameworks, documentation, and deployment processes
  • Conduct thorough code reviews and mentor junior data engineers on technical problem-solving, coding standards, and AWS best practices
  • Proactively identify and resolve scalability bottlenecks, re-designing infrastructure for greater reliability and performance
  • Evaluate emerging AWS technologies and lead proof-of-concept efforts to enhance data platform capabilities
  • Own production operations including release management, incident response, and continuous improvement of data delivery systems

Benefits

  • health insurance (medical, dental, vision, and prescription coverage; Basic Life & AD&D insurance with optional Supplemental Life plans; EAP; mental health support; Medical Advice Line; Flexible Spending Accounts; adoption and surrogacy reimbursement coverage)
  • 401(k) matching
  • paid time off
  • parental leave
  • sign-on payments
  • restricted stock units (RSUs)