Senior Inference Engineer - AI

Thomson Reuters | Toronto, ON
$110,000 - $204,200 | Hybrid

About The Position

Thomson Reuters is seeking a Senior Inference Engineer, AI. This position is open due to an existing vacancy and supports our evolving business needs. The role involves collaborating with platform teams to enhance capacity forecasting for AI workloads, and working with Product, Data Science, Architecture, and Enterprise AI teams to onboard new research models into production.

Within Platform Engineering and Enterprise AI Services, an AI Inference Engineer is responsible for productionizing, optimizing, and scaling the AI and LLM workloads that power TR’s AI-driven products. This role ensures that our trained models, from classical ML to generative AI, run efficiently across TR’s multi-cloud footprint (AWS, Azure, GCP, OCI), meet strict enterprise reliability requirements, and integrate seamlessly with our data backbone (Snowflake, OpenSearch vector search, API-managed model routing). The successful candidate will help build the next generation of TR’s AI infrastructure, working alongside cloud engineering, data engineering, product teams, and AI Services.

Requirements

  • 5+ years of relevant experience
  • Strong understanding of ML/LLM fundamentals and inference optimization techniques
  • Hands-on experience with GPU programming (CUDA preferred), inference runtimes (TensorRT, ONNX Runtime), and deep learning frameworks (PyTorch/TensorFlow)
  • Proficiency in Python and at least one systems language (C++ strongly preferred for performance-critical inference paths)
  • Experience deploying AI workloads to AWS/GCP/Azure and Kubernetes
  • Familiarity with vector search systems (OpenSearch vectors) and retrieval-augmented generation (RAG) pipelines
  • Knowledge of distributed systems, microservices, CI/CD, and cloud native architecture

Responsibilities

  • Optimize LLMs and ML models for high-performance inference using techniques such as quantization, pruning, distillation, and hardware-specific tuning
  • Deploy and scale inference workloads on GPUs across AWS, Azure, GCP, and internal Kubernetes clusters, ensuring predictable performance during peak business-hours traffic
  • Implement routing and failover strategies for OpenAI/Anthropic/Vertex AI traffic
  • Integrate models into production-grade APIs supporting TR products and enterprise workflows
  • Develop highly optimized serving environments and eliminate performance bottlenecks to reduce latency
  • Collaborate with Platform Engineering teams (Landing Zones, Network, Storage, Compute, AI) to ensure inference workloads align with TR’s cloud native patterns (AWS, Azure, GCP, OCI)
  • Build and optimize containerized inference pipelines using Kubernetes for large-scale distributed workloads
  • Ensure compliance with TR’s AI standards for deployment, monitoring, governance, and drift detection
  • Profile inference performance, identify GPU/CPU bottlenecks, and optimize compute utilization across heterogeneous hardware
  • Implement observability and health monitoring for inference pipelines, ensuring reliability of enterprise AI services
  • Collaborate closely with AI engineers to invent new quantization techniques, improve numerical precision, and explore non-standard architectures; support the scale-out of AI infrastructure during critical releases and global product rollouts
  • Partner with Cloud Engineers (Azure, AWS, GCP) to develop guardrails and automation that support inference workloads

Benefits

  • Flexible hybrid working environment (2-3 days a week in the office depending on the role)
  • Flexibility & Work-Life Balance: Flex My Way is a set of supportive workplace policies designed to help manage personal and professional responsibilities, whether caring for family, giving back to the community, or finding time to refresh and reset.
  • Work from anywhere for up to 8 weeks per year
  • Career Development and Growth: By fostering a culture of continuous learning and skill development, we prepare our talent to tackle tomorrow’s challenges and deliver real-world solutions. Our Grow My Way programming and skills-first approach ensures you have the tools and knowledge to grow, lead, and thrive in an AI-enabled future.
  • Industry Competitive Benefits: We offer comprehensive benefit plans including flexible vacation, two company-wide Mental Health Days off, access to the Headspace app, retirement savings, tuition reimbursement, employee incentive programs, and resources for mental, physical, and financial wellbeing.
  • Globally recognized, award-winning reputation for inclusion and belonging, flexibility, work-life balance, and more.
  • Social Impact: Make an impact in your community with our Social Impact Institute. We offer employees two paid volunteer days off annually and opportunities to get involved with pro-bono consulting projects and Environmental, Social, and Governance (ESG) initiatives.
  • Market competitive health, dental, vision, disability, and life insurance programs
  • Competitive 401k plan with company match
  • Competitive vacation, sick and safe paid time off, paid holidays (including two company mental health days off), parental leave, sabbatical leave.
  • Optional hospital, accident and sickness insurance paid 100% by the employee
  • Optional life and AD&D insurance paid 100% by the employee
  • Flexible Spending and Health Savings Accounts
  • Fitness reimbursement
  • Access to Employee Assistance Program
  • Group Legal and Identity Theft Protection benefits paid 100% by the employee
  • Access to 529 Plan
  • Commuter benefits
  • Adoption & Surrogacy Assistance
  • Tuition Reimbursement
  • Access to Employee Stock Purchase Plan


What This Job Offers

Job Type

Full-time

Career Level

Senior

Education Level

No Education Listed

Number of Employees

5,001-10,000 employees
