Mastercard · Posted 17 days ago
Full-time • Mid Level
O'Fallon, MO
5,001-10,000 employees

At Mastercard, we’re building the future of data integrity and trust. As part of our commitment to innovation and excellence, we’re seeking a seasoned Data Quality Engineer to lead the development of scalable, cloud-based data quality solutions. This role is pivotal in ensuring our data pipelines meet the highest standards of accuracy, consistency, and completeness—aligned with Mastercard’s global best practices. You’ll define requirements for new applications, customize solutions to Mastercard specifications, and provide mentorship and technical leadership to a high-performing team. If you thrive in a fast-paced, collaborative environment and are passionate about data quality, this is your opportunity to make a global impact.

Responsibilities:
  • Define and Develop: Build scalable data quality frameworks using PySpark and Databricks, aligned with Mastercard standards.
  • Integrate and Optimize: Design and manage ETL/ELT pipelines using Apache NiFi, Hadoop, and AWS (S3, Redshift, Glue, Lambda).
  • Monitor and Detect: Implement data quality monitoring tools (Informatica, Talend, Great Expectations) and Databricks anomaly detection.
  • Orchestrate and Automate: Orchestrate workflows with Apache Airflow, integrating with Databricks, AWS Step Functions, and Unix shell scripting to automate data quality checks and streamline operations.
  • Collaborate and Align: Work closely with data engineers, scientists, and business stakeholders to define standards and resolve issues.
  • Document and Report: Maintain clear documentation of processes, metrics, and anomaly outcomes for stakeholder insights.
  • Mentor and Lead: Provide guidance and training to junior team members, fostering a culture of excellence and innovation.

Qualifications:
  • Experience in data engineering or data quality roles within big data environments.
  • PySpark and Databricks for pipeline development and optimization.
  • AWS services (S3, Redshift, Glue, Lambda) for cloud integration.
  • Apache NiFi and Hadoop ecosystem (HDFS, MapReduce, Hive).
  • Data quality tools: Informatica, Talend, Great Expectations.
  • Advanced SQL for validation and reconciliation.
  • Apache Airflow, AWS Step Functions, and Unix shell scripting for automation.
  • Proven ability to resolve complex data quality issues.
  • Strong communication skills and cross-functional teamwork.
  • Bachelor’s degree in Computer Science, Data Engineering, or a related field (Master’s preferred).
  • Experience with Databricks Data Quality Monitoring and anomaly detection.
  • Familiarity with machine learning-based anomaly techniques.
  • Understanding of data governance (GDPR, CCPA).
  • Proficiency in Python or Scala.
  • Exposure to CI/CD pipelines and DevOps practices.

Benefits:
  • Insurance (including medical, prescription drug, dental, vision, disability, and life insurance)
  • Flexible spending account and health savings account
  • Paid leave (including 16 weeks of new parent leave and up to 20 days of bereavement leave)
  • 80 hours of Paid Sick and Safe Time, 25 days of vacation time, and 5 personal days, pro-rated based on date of hire
  • 10 annual paid U.S. observed holidays
  • 401(k) with a best-in-class company match
  • Deferred compensation for eligible roles
  • Fitness reimbursement or on-site fitness facilities
  • Eligibility for tuition reimbursement