Principal Software Engineer

Workday
Toronto, ON
Hybrid

About The Position

This is an opportunity to join a growth team focused on MLOps. We build ML capabilities into our products, and you would be building part of the next generation of Workday technology. We believe predictive products can be as ground-breaking to the next generation of technology as mobile was to the last.

As a Principal Software Engineer, you will help develop ML-powered features and experiences for every user across our HR & Talent product portfolio. You will work closely with ML engineers and other software teams to deliver critically important infrastructure and software frameworks that enable machine learning across Workday’s product ecosystem. You will apply modern MLOps, CloudOps, and data engineering stacks to enable the development, training, deployment, and lifecycle management of a variety of ML capabilities: supervised and unsupervised, deep learning and classical. You will design and develop new APIs and microservices and deploy them at scale using Python, Go, Terraform, and Kubernetes.

You will use Workday’s vast computing resources on rich, exclusive datasets to deliver value that transforms the way our end users experience Workday. We will challenge you to apply your best creative thinking, analysis, problem-solving, and technical abilities to make an impact on thousands of enterprises and millions of people.

Requirements

  • 6 or more years of validated industry experience.
  • Bachelor’s and/or Master’s degree in Computer Science or Computer Engineering.
  • Strong software engineering experience with designing and building scalable, distributed systems.
  • Deep understanding of cloud computing, cloud infrastructure, and distributed systems; experience with AWS and GCP.
  • Experience developing microservices, APIs, robust cloud services, and large-scale web applications, and managing CI/CD workflows.
  • Proficiency with Python, Go, and infrastructure-as-code tools like Terraform.
  • Experience running and maintaining Kubernetes clusters in production.
  • Experience ensuring the security and compliance of cloud platforms, including implementing best practices for encryption, data protection, and access control.

Nice To Haves

  • Experience with large-scale ML data pipelines and data lakes.
  • Ability to think across layers of the ML stack, from infrastructure to model deployment.
  • Experience developing monitoring and alerting systems for ML infrastructure.
  • Understanding of agentic AI concepts; experience with LangChain and LangSmith is preferred.
  • Proven leadership or mentoring experience.

Responsibilities

  • Lead the design and implementation of high-throughput microservices and APIs (Python/Go) that serve as the backbone for Workday’s ML ecosystem.
  • Build and optimize a unified ML development experience using Kubeflow, Kubernetes (EKS/GKE), and specialized compute orchestration (CPUs/GPUs).
  • Own the end-to-end lifecycle of cloud-based services, utilizing Infrastructure as Code (Terraform) to build resilient, self-healing environments.
  • Lead architecture reviews, code reviews, and technology evaluations to ensure our systems meet 99.99% reliability standards.
  • Design the architectural patterns and observability frameworks required to support emerging Agentic AI systems and LLM-based applications.
  • Partner with data scientists, ML engineers, and architects to translate complex data needs into elegant, maintainable software solutions.
  • Research and drive adoption of new infrastructure tools with a focus on reliability, security, and enterprise-grade scale.

Benefits

  • Workday Bonus Plan or a role-specific commission/bonus
  • Annual refresh stock grants
  • Comprehensive benefits