Lead Full Stack Software Engineer

Manulife | Toronto, ON
$113,260 - $210,340 | Hybrid

About The Position

The Lead Full Stack Software Engineer brings deep expertise across both front-end and back-end development, playing a key role in building next-generation products and services. This individual will lead the design, development, testing, and implementation of new features that enable scalable, high-quality software solutions while setting technical direction and best practices.

Requirements

  • 7–10+ years in software engineering; 3+ years leading teams/projects in AI/ML or distributed systems.
  • Hands-on experience building or integrating technology supporting AI governance (model governance) and MLOps capabilities for model lifecycle management (registry, approvals, monitoring), orchestration, and continuous learning (e.g., AI Foundry, AdaptiveML, or equivalent).
  • Proficiency in Scala or Java (Akka ecosystem), plus Python for ML tooling.
  • Experience with stream processing and data pipelines.
  • Solid MLOps background: model registries, feature stores, CI/CD for ML, containerization (Docker), orchestration (Kubernetes).
  • Cloud proficiency (AWS/Azure), Terraform or other IaC tooling, and secrets management/IAM.
  • Deep understanding of distributed systems, observability stacks, and resilience patterns.
  • Strong communication, documentation, and stakeholder management skills.

Nice To Haves

  • Experience with online learning, reinforcement learning, or active learning in production.
  • Knowledge of responsible AI frameworks, model risk management, and fairness/bias assessment.
  • Performance optimization for low-latency inference; GPU/accelerator utilization.
  • Experience in regulated industries (e.g., financial services/insurance) with audit and governance requirements.
  • ModelOp (ModelOps) and Dynamo implementation experience supporting AI governance / model governance and model lifecycle management.

Responsibilities

  • Design, build, and maintain the technology platform's features and infrastructure, including hardware, software, and network components.
  • Implement and integrate technology solutions supporting AI governance (model governance) for orchestration, feature engineering, model deployment controls, approval workflows, and audit-ready evidence capture.
  • Design and implement policy-as-code guardrails (access, usage, data handling, deployment standards) and enforcement points across the AI lifecycle.
  • Build capabilities for model lineage, traceability, and monitoring (metadata, evaluations, drift/quality signals) to support risk management and regulatory needs.
  • Build scalable microservices and event-driven pipelines for model training/inference using Akka Streams and Cluster Sharding.
  • Integrate AdaptiveML workflows for continuous/online learning, feature stores, model registries, and A/B experimentation.
  • Develop reusable reference patterns and inner-source components that meet reliability, security, and compliance standards.
  • Implement shared runtimes for multi-agent coordination, state management, memory persistence, and messaging.
  • Design interoperable APIs/SDKs used by data scientists and developers to build agent-powered applications.
  • Maintain and improve CI/CD pipelines and developer toolchains for AI services to enable rapid, compliant delivery.
  • Evaluate emerging AI/ML infrastructure capabilities; prototype and introduce tools that improve developer productivity and reliability.
  • Develop and operate scalable backend services supporting high-traffic agent interactions, retrieval operations, and real-time execution flows.
  • Use cloud-native technologies (containers, orchestration, IaC, CI/CD) to deliver reliable, cost-efficient services.
  • Optimize runtime performance across CPU/GPU/accelerator workloads.
  • Monitor and resolve persistent platform issues surfaced by technical support teams, such as bottlenecks, connectivity problems, and system failures.
  • Consider compliance and regulatory requirements throughout the platform lifecycle.
  • Implement security measures, such as access controls, encryption, and vulnerability assessments, when applicable.
  • Partner with architects and business leaders to design and build robust platforms across all Global AI Platform capability layers.

Benefits

  • Health, dental, mental health, vision, short- and long-term disability, life and AD&D insurance coverage, adoption/surrogacy and wellness benefits, and employee/family assistance plans.
  • Various retirement savings plans (including pension and a global share ownership plan with employer matching contributions) and financial education and counseling resources.
  • A generous paid time off program in Canada that includes holidays, vacation, personal, and sick days, plus the full range of statutory leaves of absence.