BigData Quality Engineer

Royal Bank of Canada • Dartmouth, NS

About The Position

As part of Global Functions Technology (GFT) within RBC's Technology and Operations division, you'll join a team that serves the whole organization with IT solutions that drive transformation and efficiency. Our work spans Risk, Finance, HR, CAO, Audit, Legal, Compliance, Financial Crime, Capital Markets, Personal and Commercial Banking, and Wealth Management. We also build digital tools and platforms that foster better collaboration across the organization. You'll work closely with leadership that values recognizing achievements and sharing insights across teams to fuel continuous improvement.

On our Finance IT Data As a Service team, you'll play a supporting role in Test Implementation and Test Automation on our Big Data Platform. The role involves technologies and programming languages such as Cloudera, Spark, Scala, Unix, SQL, Python, and Databricks, along with GenAI and Agentic AI. Your primary responsibility will be to support QE Automation for our diverse client groups: executing tests, building automation scripts, and validating data pipelines under the guidance of senior engineers. This position offers a chance to develop your technical expertise in big data testing, finance data, and AI-driven quality engineering while delivering IT solutions in a fast-paced, ever-changing business landscape.

Requirements

1-2 years of hands-on experience in QE, data testing, or data engineering in Big Data environments.
Strong foundation: Bachelor's degree in Computer Science, Engineering, Finance, or equivalent experience.
SQL proficiency: Spark SQL, Databricks SQL, Hive, Trino - writing and debugging queries for data validation and reconciliation.
Code & scripting: Python (PySpark, Pandas), Shell scripting, SQL; ability to build and maintain automated test scripts.
Big Data fundamentals: HDFS, Hive, Parquet, Spark; understanding of data pipelines.
Databricks: Experience with notebooks, Delta Lake, Databricks SQL, and basic workflow execution.
Testing fundamentals: Test case design, defect lifecycle, regression testing, integration testing, source-to-target validation.
Generative AI: Practical use of GitHub Copilot or similar AI assistants; prompt engineering for test generation and SQL writing; awareness of Agentic AI concepts.
Finance data awareness: Basic understanding of positions, valuations, products, currencies, booking entities, and reporting feeds (or strong willingness to learn quickly).
DevOps/toolchain: JIRA, Git, Confluence, Linux command line; familiarity with CI/CD concepts.

Nice To Haves

Experience with data governance, audit readiness, and privacy-preserving testing.
Exposure to test automation frameworks (Robot Framework, pytest, Selenium) and CI/CD pipelines (GitHub Actions, Jenkins, UrbanCode Deploy).
AI/ML ops exposure: dataset curation, validation, synthetic data generation for tests.

Responsibilities

Execute test cases for data pipelines from ingestion to consumption, validating transformations, aggregations, and business rules across HDFS, Hive, Spark, and Databricks.
Validate ETL/ELT logic using Python (PySpark) and SQL - joins, filters, lookups, derived columns, and data mappings; perform source-to-target reconciliation.
Develop and maintain automated test scripts using AI tools, Python (Pandas, PySpark), and SQL.
Contribute to automated test suites with CI/CD integration (GitHub Actions, Ansible Automation Platform); automate schema validation, row count checks, and data comparisons across environments.
Execute data quality checks - completeness, accuracy, consistency, and timeliness - on finance datasets.
Test Spark jobs, Hive queries, and Data Lake pipelines; validate data landing, partition management, file integrity, and data freshness.
Leverage Generative AI tools (GitHub Copilot, Windsurf) to accelerate test case generation, SQL writing, and script development; support building Agentic AI workflows for autonomous test execution and intelligent triage.
Validate data feeding into ML models and test AI-generated outputs for correctness; support feature store and training data quality validation.
Understand and test finance data attributes: trades, booking entities, currencies, GL accounts, product types, settlement dates; support regulatory and management reporting feed validation.
Partner with data engineers, SRE, and product teams to support testability and observability; log defects with clear root cause analysis in JIRA; maintain test documentation in Confluence.
Participate in Agile/Scrum ceremonies and contribute to shift-left testing practices and continuous testing workflows.