About The Position

Red Hat’s Global Sales Go-To-Market Strategy, Incentives & Data Analytics organization is seeking a Principal Data Engineer to work with a high degree of autonomy, leading the integration, automation, and optimization of complex data solutions. In this role, you will move beyond execution to provide technical leadership in data transformation, reconciliation, and architectural design. You will build robust data pipelines, ensure data governance, and collaborate with cross-functional teams to deliver high-quality data products that drive business decisions. Specific responsibilities are listed below.
About Red Hat

Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across environments, from in-office to office-flex to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We’re a leader in open source because of our open and inclusive environment. We hire creative, passionate people who are ready to contribute their ideas, help solve complex problems, and make an impact.

Requirements

  • 5+ years of experience as a Data Engineer, BI Engineer, or Systems Analyst in an enterprise environment with large, complex data sources.
  • Master’s degree in Computer Science, IT, or Engineering, or equivalent experience.
  • Deep expertise in relational databases (PostgreSQL, MSSQL, etc.) and query optimization.
  • Strong programming skills for data querying, cleaning, and presentation, with hands-on experience in data-centric libraries.
  • Ability to manage multiple projects simultaneously in a fast-paced, distributed team environment across different time zones and cultures.
  • Exceptional logic and reasoning skills to troubleshoot complex data issues.
  • Ability to think strategically about data architecture and project planning.
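The query-optimization expertise these requirements call for can be illustrated with a small sketch (not part of the posting) using Python’s standard-library sqlite3 module: adding an index changes the query plan from a full table scan to an index search. The `orders` table and its columns are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EMEA" if i % 2 else "NA", float(i)) for i in range(1000)],
)

def plan(sql):
    # EXPLAIN QUERY PLAN reports whether SQLite scans the table or uses an index;
    # the human-readable detail is the last column of each plan row.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT SUM(amount) FROM orders WHERE region = 'EMEA'"
before = plan(query)  # full table scan before the index exists
conn.execute("CREATE INDEX idx_orders_region ON orders (region)")
after = plan(query)   # index search once the index is in place

print(before)
print(after)
```

The same inspect-then-index workflow applies to PostgreSQL or MSSQL via their `EXPLAIN` / execution-plan tooling, just with different output formats.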

Nice To Haves

  • Working knowledge of dbt (Data Build Tool) and Snowflake data warehousing.
  • Experience with Fivetran or similar integration tools.

Responsibilities

  • Advanced SQL Development: Write complex, highly optimized SQL queries across large datasets. You will be the subject matter expert for SQL query tuning and providing performance recommendations to the wider team.
  • Python Automation: Apply advanced Python proficiency (including libraries such as Pandas and NumPy) to clean, transform, and merge raw datasets, automating complex data extraction and loading processes.
  • Pipeline Orchestration: Design, schedule, and monitor robust data pipelines using tools like Airflow. You will take ownership of debugging workflows and resolving performance bottlenecks.
  • Data Stewardship: Act as a guardian of data integrity. This includes leading initiatives on data governance, compliance, transformation, and validation audits.
  • Automated Testing & CI/CD: Develop and maintain automated unit, end-to-end, and integration tests to ensure data accuracy. Participate actively in version control (Git) and CI/CD processes for deploying pipeline changes across environments.
  • Cross-Functional Leadership: Partner with Analysts, Engineers, and Operations teams to understand data needs and ensure data accessibility for business stakeholders from the finance and operations organizations.
  • Problem Solving: Apply strong analytical skills to translate complex algorithms into efficient software solutions, converting raw data into actionable insights by identifying trends, outliers, and distributions.
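The Python Automation and Data Stewardship responsibilities above can be sketched in a few lines of Pandas: normalize join keys, coerce amounts, drop rows that cannot be reconciled, and validate the merge so bad data fails loudly. This is a minimal illustration, not the team’s actual pipeline; the `bookings` and `accounts` datasets and their column names are hypothetical.

```python
import pandas as pd

# Hypothetical raw extracts: bookings from one system, account metadata from another.
bookings = pd.DataFrame({
    "account_id": [" A1 ", "a2", "A3", None],
    "amount": ["100", "250", "bad", "75"],
})
accounts = pd.DataFrame({
    "account_id": ["A1", "A2", "A3"],
    "segment": ["Enterprise", "Commercial", "Enterprise"],
})

# Clean: normalize keys, coerce amounts, drop rows that cannot be reconciled.
bookings["account_id"] = bookings["account_id"].str.strip().str.upper()
bookings["amount"] = pd.to_numeric(bookings["amount"], errors="coerce")
bookings = bookings.dropna(subset=["account_id", "amount"])

# Merge: validate the join is many-to-one so duplicate account rows raise an error
# instead of silently fanning out the bookings.
merged = bookings.merge(accounts, on="account_id", how="left", validate="m:1")

# Summarize cleaned amounts per segment for downstream reporting.
summary = merged.groupby("segment", dropna=False)["amount"].sum()
print(summary)
```

In a production pipeline the same steps would typically run as tasks in an orchestrator such as Airflow, with the `validate=` check and row-count assertions doubling as lightweight data-quality tests.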

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Account - healthcare and dependent care
  • Health Savings Account - high deductible medical plan
  • Retirement 401(k) with employer match
  • Paid time off and holidays
  • Paid parental leave plans for all new parents
  • Leave benefits including disability, paid family medical leave, and paid military leave
  • Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!


What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Number of Employees: 501-1,000 employees
