Data Engineering Associate

Morgan Stanley | New York, NY

About The Position

We’re seeking someone to join our team as a Data Engineering Associate. In the Technology division, we leverage innovation to build the connections and capabilities that power our Firm, enabling our clients and colleagues to redefine markets and shape the future of our communities.

This is a Data & Analytics Engineering position at the Associate level, part of the job family responsible for providing specialist data analysis and expertise that drive decision-making and business insights, as well as crafting data pipelines, implementing data models, and optimizing data processes for improved data accuracy and accessibility, including applying machine learning and AI-based techniques.

Morgan Stanley

Since 1935, Morgan Stanley has been a global leader in financial services, continuously evolving and innovating to better serve our clients and our communities in more than 40 countries around the world.

What You Will Do

The Data Engineer will support the design and development of big data pipelines and distributed data processing systems for enterprise analytics and data products. This role is ideal for early-career engineers with strong fundamentals in Python, Apache Spark, distributed systems, and cloud-based data platforms, and an interest in using AI-assisted tools to improve productivity and code quality.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
  • 3+ years of experience or strong academic/project experience in data engineering or big data systems.
  • Working knowledge of Python for data processing and automation.
  • Working knowledge of Apache Spark (coursework, internships, or projects acceptable).
  • Understanding of distributed systems fundamentals (e.g., partitioning, fault tolerance, scalability).
  • Familiarity with cloud platforms (e.g., AWS, Azure, or GCP) and cloud-based data services.
  • Exposure to AI-powered developer tools for coding, testing, or documentation.
  • Strong problem-solving skills and willingness to learn in a fast-paced environment.
  • Comfortable working in Linux/UNIX environments.

Responsibilities

  • Assist in building and maintaining large-scale data pipelines using Apache Spark and distributed processing frameworks.
  • Support batch and streaming data ingestion and transformation workflows.
  • Develop and maintain Python-based data processing and validation components.
  • Use AI-assisted development tools (e.g., coding copilots, code review and debugging tools) to improve development efficiency and quality.
  • Monitor data pipelines and help ensure data quality, reliability, and observability.
  • Collaborate with senior engineers, architects, and analysts on scalable data models and ingestion patterns.
  • Participate in testing, deployment, and CI/CD activities for data pipelines.
  • Document data flows, pipelines, and operational procedures.