Lead Full Stack Software Engineer

Manulife•Toronto, ON

4d•$113,260 - $210,340•Hybrid

About The Position

The Lead Full Stack Software Engineer brings deep expertise across both front-end and back-end development, playing a key role in building next-generation products and services. This individual will lead the design, development, testing, and implementation of new features that enable scalable, high-quality software solutions while setting technical direction and best practices.

Requirements

7–10+ years in software engineering; 3+ years leading teams/projects in AI/ML or distributed systems.
Hands-on experience building or integrating technology supporting AI governance (model governance) and MLOps capabilities for model lifecycle management (registry, approvals, monitoring), orchestration, and continuous learning (e.g., AI Foundry, AdaptiveML, or equivalent).
Proficiency in Scala or Java (Akka ecosystem), plus Python for ML tooling.
Experience with stream processing and data pipelines.
Solid MLOps background: model registries, feature stores, CI/CD for ML, containerization (Docker), orchestration (Kubernetes).
Cloud proficiency (AWS/Azure), Terraform or IaC, and secrets/IAM.
Deep understanding of distributed systems, observability stacks, and resilience patterns.
Strong communication, documentation, and stakeholder management skills.

Nice To Haves

Experience with online learning, reinforcement learning, or active learning in production.
Knowledge of responsible AI frameworks, model risk management, and fairness/bias assessment.
Performance optimization for low-latency inference; GPU/accelerator utilization.
Experience in regulated industries (e.g., financial services/insurance) with audit and governance requirements.
ModelOp (ModelOps) and Dynamo implementation experience supporting AI governance / model governance and model lifecycle management.

Responsibilities

Design, build, and maintain the technology platform's features and infrastructure, including hardware, software, and network components.
Implement and integrate technology solutions supporting AI governance (model governance) for orchestration, feature engineering, model deployment controls, approval workflows, and audit-ready evidence capture.
Design and implement policy-as-code guardrails (access, usage, data handling, deployment standards) and enforcement points across the AI lifecycle.
Build capabilities for model lineage, traceability, and monitoring (metadata, evaluations, drift/quality signals) to support risk management and regulatory needs.
Build scalable microservices and event-driven pipelines for model training/inference using Akka Streams and Cluster Sharding.
Integrate AdaptiveML workflows for continuous/online learning, feature stores, model registries, and A/B experimentation.
Develop reusable reference patterns and inner-source components that meet reliability, security, and compliance standards.
Implement shared runtimes for multi-agent coordination, state management, memory persistence, and messaging.
Design interoperable APIs/SDKs used by data scientists and developers to build agent-powered applications.
Maintain and improve CI/CD pipelines and developer toolchains for AI services to enable rapid, compliant delivery.
Evaluate emerging AI/ML infrastructure capabilities; prototype and introduce tools that improve developer productivity and reliability.
Develop and operate scalable backend services supporting high-traffic agent interactions, retrieval operations, and real-time execution flows.
Use cloud-native technologies (containers, orchestration, IaC, CI/CD) to deliver reliable, cost-efficient services.
Optimize runtime performance across CPU/GPU/accelerator workloads.
Monitor and resolve persistent platform issues surfaced by technical support teams, such as bottlenecks, connectivity problems, and system failures.
Consider compliance and regulatory requirements throughout the platform lifecycle.
Implement security measures, such as access controls, encryption, and vulnerability assessments, when applicable.
Partner with architects and business leaders to design and build robust platforms across all Global AI Platform capability layers.

Benefits

Health, dental, mental health, vision, short- and long-term disability, life and AD&D insurance coverage, adoption/surrogacy and wellness benefits, and employee/family assistance plans.
Various retirement savings plans (including pension and a global share ownership plan with employer matching contributions) and financial education and counseling resources.
A generous paid time off program in Canada that includes holidays, vacation, personal, and sick days, plus the full range of statutory leaves of absence.
