Principal Data Engineer

Fidelity Investments
Durham, NC (Hybrid)

About The Position

Position Description: Multiple openings available. Develops Continuous Integration and Continuous Delivery (CI/CD) pipelines, including software configuration management, test automation, version control, and static code analysis. Programs using modern object-oriented programming languages and frameworks, including Python and Spark. Works on data warehousing and data mart concepts and implementations in relational databases, including Oracle, SQL Server, Netezza, and Snowflake. Works with Extract, Transform, Load (ETL) technologies (Informatica). Works closely with the product owner, scrum master, architects, and other developers to design, build, test, and deliver software applications and features that improve the operational efficiency of compliance and risk groups.

Requirements

  • Bachelor’s degree (or foreign education equivalent) in Computer Science, Engineering, Information Technology, Information Systems, Mathematics, Physics, or a closely related field and five (5) years of experience as a Principal Data Engineer (or closely related occupation) performing data analysis, solution design, and development of data ingestion frameworks and pipelines in a financial services environment.
  • Or, alternatively, Master’s degree (or foreign education equivalent) in Computer Science, Engineering, Information Technology, Information Systems, Mathematics, Physics, or a closely related field and three (3) years of experience as a Principal Data Engineer (or closely related occupation) performing data analysis, solution design, and development of data ingestion frameworks and pipelines in a financial services environment.
  • Demonstrated Expertise (“DE”) designing and implementing highly scalable, high-performance data ingestion frameworks and pipelines to enable data integration, transformation, and analytics in a financial services domain, using Python, Amazon Web Services (AWS) (Lambda, EMR, S3, and EC2), Linux and shell scripting, Informatica, and Control-M.
  • DE performing data modeling and database design using Star, Snowflake, and Data Vault techniques and dimensional structures (slowly changing dimension types 1, 2, and 3) in a Data Warehouse environment with distributed frameworks (Snowflake and PySpark) and databases (Oracle and Snowflake).
  • DE translating business requirements into technical validations and examining data to determine accuracy and quality, using SQL and PL/SQL; and automating CI/CD deployments using Stash, GitHub, Jenkins, and uDeploy.
  • DE designing and developing automation frameworks for ETL testing, feed-file-to-database comparison, and database-to-database comparison, using the iCEDQ tool and the Gherkin language; and testing Tableau dashboards using data permutation combinations.

Responsibilities

  • Architects, crafts, and develops highly scalable distributed data processing systems.
  • Collaborates with business and technology groups on formal and informal presentations.
  • Designs batch and streaming programs and adheres to database standards and best practices.
  • Researches, designs, and develops computer and network software or specialized utility programs.
  • Provides analytics and reporting services to the organization.
  • Enables Business Intelligence capabilities and creates data-driven business solutions.
  • Analyzes user needs and develops software solutions.
  • Updates software or enhances existing software capabilities.
  • Collaborates to obtain information on project limitations and capabilities, performance requirements, and interfaces.
  • Develops and oversees software system testing and validation procedures, programming, and documentation.