Senior Platform Engineer - Global AI Platform

Manulife | Toronto, ON
$113,000 - $163,000 | Hybrid

About The Position

This role designs and delivers a scalable, secure, cloud-native AI platform that enables enterprise-grade AI and agent-powered solutions. The position focuses on building event-driven, distributed services using Akka and modern MLOps practices to support continuous learning, experimentation, and governance. Working closely with architects, data scientists, and business leaders, this role ensures reliable, high-performance AI infrastructure that accelerates innovation while meeting compliance and regulatory requirements.

Requirements

  • 5+ years in software engineering.
  • 3+ years leading teams/projects in AI/ML or distributed systems.
  • Strong expertise in Akka and event-driven microservices at scale.
  • Hands-on experience with AI Foundry and AdaptiveML (or equivalent platforms for model lifecycle, orchestration, and continuous learning).
  • Proficiency in Scala or Java (Akka ecosystem), plus Python for ML tooling.
  • Experience with stream processing and data pipelines.
  • Solid MLOps background: model registries, feature stores, CI/CD for ML, containerization (Docker), orchestration (Kubernetes).
  • Cloud proficiency (AWS/Azure), Terraform or other IaC tooling, and secrets/IAM management.
  • Deep understanding of distributed systems: consistency, partitioning, backpressure, resilience patterns.
  • Strong communication and documentation skills.

Nice To Haves

  • Experience with online learning, reinforcement learning, or active learning in production.
  • Knowledge of responsible AI frameworks, model risk management, and fairness/bias assessment.
  • Performance optimization for low-latency inference; GPU/accelerator utilization.
  • Experience in regulated industries (e.g., financial services/insurance) with audit and governance requirements.

Responsibilities

  • Builds and maintains high-performance, fault-tolerant, secure, and scalable AI platform services and abstractions that support diverse AI solutions with automation-first delivery.
  • Designs, builds, and maintains the technology platform's features and infrastructure, including hardware, software, and network components.
  • Integrates Akka and AdaptiveML workflows for continuous/online learning, feature stores, model registries, and A/B experimentation.
  • Implements AI Foundry components for orchestration, feature engineering, model deployment, and governance.
  • Develops reusable reference patterns and inner-source components that meet reliability, security, and compliance standards.
  • Implements shared runtimes for multi-agent coordination, state management, memory persistence, and messaging.
  • Designs interoperable APIs/SDKs used by data scientists and developers to build agent-powered applications.
  • Maintains and improves CI/CD pipelines and developer toolchains for AI services to enable rapid, compliant delivery.
  • Evaluates emerging AI/ML infrastructure capabilities; prototypes and introduces tools that improve developer productivity and reliability.
  • Develops and operates scalable backend services supporting high-traffic agent interactions, retrieval operations, and real-time execution flows.
  • Uses cloud-native technologies (containers, orchestration, IaC, CI/CD) to deliver reliable, cost-efficient services.
  • Optimizes runtime performance across CPU/GPU/accelerator workloads.
  • Monitors and resolves persistent platform issues, such as bottlenecks, connectivity problems, and system failures, when surfaced by technical support teams.
  • Considers compliance and regulatory requirements throughout the platform lifecycle.
  • Implements security measures, such as access controls, encryption, and vulnerability assessments, when applicable.
  • Partners with architects and business leaders to design and build robust platforms across all Global AI Platform capability layers.
  • Forms a holistic understanding of tools, key business concepts, data, and cross-team dependencies.
  • Investigates new platform solutions to enhance service delivery experience.
  • Performs peer reviews of code, deliverables, and analysis to support continuous learning and improvement.

Benefits

  • Health, dental, mental health, vision, short- and long-term disability, life and AD&D insurance coverage, adoption/surrogacy and wellness benefits, and employee/family assistance plans.
  • Various retirement savings plans (including pension and a global share ownership plan with employer matching contributions) and financial education and counseling resources.
  • Generous paid time off program in Canada, including holidays, vacation, personal, and sick days, plus the full range of statutory leaves of absence.