About The Position

We’re looking for an experienced Lead DevOps Engineer (ML Platform) who will bring focus and subject-matter expertise in designing and implementing machine learning infrastructure and automation tools (MLOps and DevOps). This is a unique opportunity to grow in the world of machine learning infrastructure and work with a team of passionate individuals committed to the mission of bringing ML to the enterprise. At RBC Borealis, you’ll be joining a team that works directly with leading researchers in machine learning, has access to rich and massive datasets, and offers the computational resources to support ongoing development in areas such as reinforcement learning, unsupervised learning, and computer vision. You can find out more about our research areas at rbcborealis.com.

Requirements

  • 5+ years of experience building and maintaining DevOps pipelines with tools such as Jenkins and GitHub Actions
  • Strong and relevant experience designing and implementing distributed systems and Machine Learning systems
  • Previous experience with MLOps orchestration tools such as Airflow, Kubeflow, Dagster, Flyte, or Metaflow
  • In-depth knowledge of various stages of the machine learning application deployment process
  • Experience with building tools and applications to automate various infrastructure and DevOps tasks
  • Proficiency with programming languages such as Python, Bash, or JavaScript
  • Solid understanding of the UNIX operating system
  • Experience implementing monitoring solutions to identify system bottlenecks and production issues
  • Knowledge of professional software engineering best practices for the full software development life cycle, including testing methods, coding standards, code reviews and source control management
  • Hands-on experience building and deploying hybrid environments that span on-premises infrastructure and major cloud platforms such as AWS and Azure
  • Familiarity with machine learning frameworks such as PyTorch, TensorFlow, or similar

Responsibilities

  • Designing, building, and optimizing machine learning deployment tools and automation systems that operate the business’s data and ML applications
  • Designing and implementing best practices and standards for data and machine learning pipelines across the organization
  • Collaborating with engineers and machine learning researchers to automate code analysis, build, integration, and deployment of ML applications
  • Supporting applications and projects with infrastructure design decisions and monitoring solutions
  • Building highly scalable, resilient cloud and on-premises systems for hosting machine learning systems using state-of-the-art technologies

Benefits

  • Become part of a team that thinks progressively and works collaboratively. We care about seeing each other reach full potential
  • A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock options where applicable
  • Leaders who support your development through coaching and managing opportunities
  • Ability to make a difference and a lasting impact on a local-to-global scale

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Education Level

No Education Listed

Number of Employees

5,001-10,000 employees