Research Engineer - LLM Training & Alignment Systems

Huawei Technologies Canada Co., Ltd.•Kingston, ON

12d•CA$127,000 - CA$225,000

About The Position

Huawei Canada has an immediate 12-month contract opening for a Research Engineer. The Centre for Software Excellence Lab conducts pioneering research in software engineering, focusing on next-generation technologies. This team integrates industry best practices with cutting-edge academic research to address lifecycle software engineering challenges, including foundation model applications, software performance engineering, hyper-cluster programming, next-gen mobile OS, and cloud-native computing. This lab uniquely allows researchers to apply innovations directly to products affecting billions of customers while promoting open-source contributions, publications, conference participation, and collaborations to create a broader impact.

Requirements

Hands-on experience with large language model training and alignment, including supervised fine-tuning (SFT), reward modelling, preference learning, and reinforcement learning–based optimization (e.g., PPO, GRPO), with a solid understanding of stability, scalability, and efficiency trade-offs.
Strong background in large-scale distributed training systems, with experience optimizing performance, resource utilization, and reliability across multi-node, multi-device, and heterogeneous hardware environments (e.g., GPU, NPU).
Experience building SFT data pipelines, including semantic deduplication and synthetic data generation, with a clear understanding of how data quality and distribution affect model behavior and alignment.
Experience designing or applying LLM evaluation and benchmarking frameworks, including automated evaluation, preference-based assessment, and regression analysis to measure model quality, alignment, and robustness.
Proficiency in Python, C/C++, or Go, with the ability to translate research ideas into scalable, reproducible prototype systems, and to communicate technical insights effectively across research and engineering teams.

Responsibilities

Research, prototype, and build core infrastructure, tooling, and platforms to support the full lifecycle of large foundation model development, including data curation, model training, alignment, and evaluation, with a strong focus on scalability, efficiency, and research impact.
Design and implement systems and workflows for SFT data curation, deduplication, and synthetic data generation, enabling high-quality training signals for large language models.
Develop and optimize distributed training and alignment pipelines, including supervised fine-tuning, reward modelling, and reinforcement learning–based preference optimization (e.g., PPO, GRPO), across heterogeneous hardware platforms.
Build and validate LLM evaluation and benchmarking frameworks that assess model quality, alignment, and robustness, and detect regressions across training iterations.
Collaborate closely with systems, hardware, and research teams to integrate novel algorithms and software frameworks into in-house platforms, addressing challenges such as performance modelling, resource allocation, scheduling, fault tolerance, and communication efficiency.
Work with leading industry and academic experts worldwide, contribute to impactful research publications, and drive innovation through prototype systems and patentable inventions that advance large-scale model training and serving.
