Sr Data Engineer - Data Platform & AI Enablement

Citizens • Johnston, RI

About The Position

The Enterprise Data Enablement team is seeking a Senior Data Engineer who can design, develop, and maintain secure, scalable, and efficient data pipelines and platforms. This role will focus on building and deploying data solutions across financial consumer business domains by leveraging existing and new data framework capabilities to acquire, transform, stream, and integrate data. The candidate will also contribute to innovative data engineering solutions, including AI/GenAI/Agentic AI-ready data capabilities, while collaborating with and supporting a team of data engineers in building scalable, secure, and intelligent data platforms.

Requirements

6–8+ years of experience in data engineering and distributed data processing technologies.
Hands-on experience with streaming technologies such as Apache Spark, Beam, or Flink.
Experience with message brokers such as Apache Kafka.
Experience working with microservices and batch processing systems.
Strong programming skills in Java and/or Scala; Python experience preferred.
Strong SQL development and performance optimization skills.
Solid knowledge of relational databases (Redshift, PostgreSQL, Snowflake) and NoSQL databases (MongoDB or similar).
Experience with CI/CD pipelines and version control systems such as Bitbucket and Git.
Understanding of cloud-based data processing, with AWS and/or Azure experience preferred.
Experience building data pipelines that support analytics, machine learning, and AI workloads.
Working knowledge of data engineering concepts supporting LLM-based applications, including retrieval pipelines, embeddings workflows, and unstructured data processing.
Familiarity with AI/GenAI concepts such as RAG, semantic search, document processing, and model inference workflows.
Understanding of data governance, security, lineage, and compliance requirements, particularly in regulated environments.
Strong analytical and problem-solving skills, with the ability to collaborate effectively within technical teams.

Nice To Haves

Experience with ETL development tools such as Talend or DataStage.
Experience with Java Spring Boot; familiarity with React, TypeScript, or Angular.
Exposure to workflow orchestration frameworks and automation patterns.
Exposure to vector databases, semantic models, or MLOps/LLMOps concepts.

Responsibilities

Design, build, and maintain reliable, efficient, and scalable data pipelines to acquire, transform, and store large datasets.
Develop robust data pipelines to collect, process, and compute metrics from various financial data sources while adhering to quality and development standards.
Contribute to application architecture and technical solutions, and help implement data framework patterns alongside senior engineers and architects.
Collaborate with cross-functional teams to deliver optimal data solutions that meet business and platform needs.
Develop and deploy high-quality, production-ready code.
Apply strong database design principles and data modeling techniques to translate business requirements into scalable data solutions.
Develop and optimize data models to support analytics, reporting, machine learning, and AI-driven use cases.
Support the implementation and enhancement of enterprise data frameworks and contribute to scalable solutions.
Identify opportunities to improve existing frameworks and help build reusable capabilities across the organization.
Troubleshoot and resolve data-related issues in a timely manner.
Execute unit testing for data pipelines, validate results, and ensure data quality and accuracy; partner with business users for User Acceptance Testing and support deployment activities.
Follow change management practices and ensure adherence to compliance and regulatory standards.
Design and build data pipelines and platform capabilities that support AI, Generative AI, and Agentic AI use cases, including model training, inference, retrieval, and orchestration workflows.
Enable AI-ready data foundations by developing high-quality, governed, and reusable datasets for machine learning, large language model (LLM), and intelligent automation solutions.
Develop and optimize pipelines for structured, semi-structured, and unstructured data to support GenAI use cases such as semantic search, document intelligence, and retrieval-augmented generation (RAG).
Partner with data scientists, ML engineers, architects, and product teams to integrate AI/GenAI capabilities into enterprise data platforms and workflows.
Implement metadata, lineage, governance, security, and access controls required for responsible AI and enterprise-scale GenAI adoption.
Ensure observability, reliability, performance, and data quality for data pipelines, including those supporting AI-enabled workflows.