Data Engineering Lead

Sanofi
Toronto, ON (Onsite)

About The Position

Ready to push the limits of what’s possible? Join Sanofi in one of our corporate functions and you can play a vital part in the performance of our entire business while helping to make an impact on millions around the world. We are an innovative global healthcare company, driven by one purpose: we chase the miracles of science to improve people’s lives. Our team, across some 100 countries, is dedicated to transforming the practice of medicine by working to turn the impossible into the possible. We provide potentially life-changing treatment options and life-saving vaccine protection to millions of people globally, while putting sustainability and social responsibility at the center of our ambitions.

Sanofi has recently embarked on a vast and ambitious digital transformation program. A cornerstone of this roadmap is the acceleration of its data transformation and of the adoption of artificial intelligence (AI) and machine learning (ML) solutions, to accelerate R&D, manufacturing, and commercial performance and bring better drugs and vaccines to patients faster, to improve health and save lives.

Who You Are

You are a dynamic Data Engineering Lead interested in challenging the status quo to ensure the seamless creation and operation of the data pipelines needed by Sanofi’s advanced analytics, AI, and ML initiatives for the betterment of our global patients and customers. You are a valued influencer and leader who has contributed to making key datasets available to data scientists, analysts, and consumers throughout the enterprise to meet vital business needs. You have a keen eye for improvement opportunities while remaining in full compliance with all data quality, security, and governance standards.

Our Vision For Digital, Data Analytics And AI

Join us on our journey in enabling Sanofi’s Digital Transformation by becoming an AI-first organization.
This means:

  • AI Factory - Versatile Teams Operating in Cross-Functional Pods: utilizing digital and data resources to develop AI products, bringing data management, AI, and product development skills to products, programs, and projects to create an agile, fulfilling, and meaningful work environment.
  • Leading-Edge Tech Stack: experience building products that will be deployed globally on a leading-edge tech stack.
  • World-Class Mentorship and Training: working with renowned leaders and academics in machine learning to further develop your skillset.

Requirements

  • Bachelor’s degree in computer science, engineering, or similar quantitative field of study
  • 5+ years of relevant experience developing backend, integration, data pipelining, and infrastructure
  • Expertise in database optimization and performance improvement
  • Expertise in Python, PySpark, and Snowpark
  • Experience with data warehousing and object-relational databases (Snowflake and PostgreSQL), and with writing efficient SQL queries
  • Experience in cloud-based data platforms (Snowflake, AWS)
  • Proficiency in developing robust, reliable APIs using Python and FastAPI Framework
  • Understanding of data structures and algorithms
  • Experience in DBT
  • Strong collaboration skills, willingness to work with others to ensure seamless integration of the server-side and client-side
  • Fluency in English is required
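The requirements above emphasize writing efficient SQL against warehouses such as Snowflake and PostgreSQL. As a minimal, runnable sketch of the indexing-aware query work this involves, the example below uses Python’s built-in sqlite3 as a stand-in engine; the table and column names are illustrative, not from the posting, and the same SQL shape carries over to the warehouse platforms named above.

```python
import sqlite3

# In-memory database as a stand-in for a warehouse table (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("NA", 120.0), ("EU", 75.5), ("NA", 42.0), ("APAC", 300.0)],
)

# An index on the filter column lets the engine avoid a full table scan;
# the analogous concern in Snowflake is clustering and pruning.
conn.execute("CREATE INDEX idx_orders_region ON orders (region)")

# Filter before aggregating, and aggregate in the database rather than in Python.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders WHERE region = ? GROUP BY region",
    ("NA",),
).fetchall()
print(rows)  # [('NA', 162.0)]
```

The design point is the same at warehouse scale: push filters and aggregations into the engine, and make sure the physical layout (indexes, clustering) supports the query’s access pattern.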

Nice To Haves

  • Experience with IICS (Informatica Intelligent Cloud Services)
  • Experience with modern testing and code-quality tools (SonarQube, K6)
  • Knowledge of DevOps best practices, especially the setup, configuration, maintenance, and troubleshooting of the associated tools: containers and containerization technologies (Kubernetes, Argo, Red Hat OpenShift)
  • Infrastructure as code (Terraform)
  • Monitoring and Logging (CloudWatch, Grafana)
  • CI/CD Pipelines (JFrog Artifactory)
  • Scripting and automation (Python, GitHub, GitHub Actions)
  • Experience with JIRA & Confluence
  • Workflow orchestration (Airflow)
  • Message brokers (RabbitMQ)
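Several of the tools above, Airflow in particular, revolve around expressing a pipeline as a DAG of dependent tasks. The stdlib-only sketch below illustrates the underlying idea of dependency-ordered execution; the task names and the tiny runner are illustrative and are not Airflow’s API (Airflow defines DAGs with its own operators and scheduler). Requires Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter

# Illustrative pipeline: each task maps to the set of tasks it depends on.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"validate"},
}

def run(name: str, log: list) -> None:
    # A real task would call out to Spark, dbt, Snowflake, etc.
    log.append(name)

log = []
# static_order() yields tasks so that every dependency runs first.
for task in TopologicalSorter(dag).static_order():
    run(task, log)

print(log)  # ['extract', 'transform', 'validate', 'load']
```

An orchestrator like Airflow adds scheduling, retries, and observability on top of exactly this ordering guarantee.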

Responsibilities

  • Establish technical designs that meet Sanofi requirements and align with architectural and data standards
  • Own the entire back end of the application, including the design, implementation, testing, and troubleshooting of the core application logic, databases, data ingestion and transformation, data processing and pipeline orchestration, APIs, CI/CD integration, and other processes
  • Fine-tune and optimize queries using Snowflake platform features and database tuning techniques
  • Optimize ETL/data pipelines to balance performance, functionality, and other operational requirements.
  • Assess and resolve data pipeline issues to ensure performance and timeliness of execution
  • Assist with technical solution discovery to ensure technical feasibility.
  • Assist in setting up and managing CI/CD pipelines and development of automated tests
  • Develop and manage microservices using Python
  • Conduct peer reviews for quality, consistency, and rigor of production-level solutions
  • Design application architecture for efficient concurrent user handling, ensuring optimal performance during high usage periods
  • Own all areas of the product lifecycle: design, development, test, deployment, operation, and support
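Two of the responsibilities above, resolving data pipeline issues and ensuring timeliness of execution, commonly come down to handling transient failures gracefully. The sketch below shows one standard pattern, retry with exponential backoff, in plain Python; the function and task names are illustrative, not part of any Sanofi codebase, and a production version would also log attempts and cap total elapsed time.

```python
import time

def run_with_retries(task, retries: int = 3, base_delay: float = 0.01):
    """Run a pipeline step, retrying transient failures with exponential backoff.

    `task` is any zero-argument callable. On the final attempt the
    exception is re-raised so the orchestrator can mark the run failed.
    """
    for attempt in range(1, retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulated flaky step: fails twice with a transient error, then succeeds.
calls = {"n": 0}
def flaky_load():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient warehouse error")
    return "loaded"

result = run_with_retries(flaky_load)
print(result, calls["n"])  # loaded 3
```

In an orchestrated pipeline this logic usually lives in the scheduler’s retry settings rather than in hand-rolled code, but the failure-isolation principle is the same.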

Benefits

  • High-quality healthcare
  • Prevention and wellness programs