Senior ML Data Platform Developer

LawZero•Montreal, QC

Onsite

About The Position

We are seeking a visionary and highly technical Senior ML Data Platform Developer to architect, implement, scale, and maintain the data engine powering our next-generation frontier models. In this high-impact role, you will bridge the gap between cutting-edge AI research and high-performance engineering, treating the data platform as an internal product with our researchers as your primary customers. You will be responsible for designing a multi-tiered, ultra-low-latency storage architecture and building automated, petabyte-scale data processing pipelines. Our technical environment is not fixed and will evolve as our projects scale. We expect someone who can challenge and evolve it rather than simply follow industry trends, and who makes sustainable decisions in close collaboration with our Research and Product teams.

Requirements

A bachelor’s degree in a relevant field (e.g., computer science, computer engineering, software engineering) is required.
5+ years of experience designing, implementing, and managing web-scale storage or high-performance networking (e.g., in HPC environments), or working within large-scale distributed ML data frameworks, with recent experience using tools such as Lustre, Ray, Apache Spark, workflow orchestrators, Apache Arrow, and/or Parquet.
Ability to collaborate effectively with cross-functional teams, document best practices, and stay current with the latest advancements in large-scale data processing and software development.
Experience with workload managers (e.g., Ray, Kubernetes, SLURM).
Familiarity with containerization tools (e.g., Docker, Enroot).
Familiarity with data infrastructures and platforms (e.g., vector databases).

Responsibilities

Design and maintain a layered storage architecture and partner with the Research team to ensure seamless integration with the training pipelines.
Scale and automate the data processing stack to handle petabytes of data and ensure its smooth operation.
Ensure efficient use of compute resources, including GPU access for compute-intensive data processing tasks.
Assist the Infrastructure team in provisioning the compute and storage environments to support scaling.
Ensure all datasets, including the intermediate outputs of each transformation stage, are versioned, reproducible, and fully traceable to meet specific and dynamic experiment needs, and are accompanied by datasheets, in accordance with internal Data Governance policies.
Collaborate with the Research team and other teams to understand their self-service needs around dataset exploration, sampling, and analysis, and develop proper tooling.

Benefits

Comprehensive health benefits (including mental health and wellness management account)
20 days of vacation per year upon start
Employer contribution of 4% to your retirement savings, with no required employee match
Additional compensation totaling 8% of your salary to apply towards additional retirement savings or bonuses (independent of group and individual performance)
