Software Development Manager, Flow Platform

Autodesk • Montreal, QC

About The Position

Autodesk's Media & Entertainment (M&E) group is redefining the future of entertainment. We empower content creators to inspire, educate, and entertain while investing in our employees to build meaningful careers with us. Autodesk uniquely offers platforms, community, resources, best-in-class tools, and processes that unlock new levels of productivity and creativity in media and entertainment.

We are seeking a Software Development Manager to lead our Platform Operations team for Flow, Autodesk’s cloud platform powering next-generation media and entertainment workflows. This is a critical leadership role focused on operational excellence, reliability, and platform readiness at scale. You will lead a team responsible for defining and driving best practices across monitoring, alerting, incident response, CI/CD, and production readiness. You will partner closely with engineering, data, security, and infrastructure teams to ensure Flow services are reliable, scalable, and ready to support high-growth products and AI/ML-driven workflows.

This is your opportunity to shape how a modern cloud platform operates at scale by driving reliability, improving developer velocity, and enabling the next generation of AI-powered creative tools.

Requirements

Bachelor’s degree in Computer Science, Software Engineering, or a related field
7+ years of software engineering experience, with 2+ years in a technical leadership or management role
Experience operating large-scale distributed systems in production environments
Strong understanding of cloud platforms (AWS, Azure, or GCP) and modern service architectures
Experience with CI/CD pipelines, release engineering, and deployment strategies
Hands-on experience with monitoring, alerting, and observability tools
Proven ability to lead incident response and improve operational processes
Strong communication and collaboration skills across cross-functional teams

Nice To Haves

Background in SRE, DevOps, or platform engineering (strongly preferred)
Experience with SLO/SLI frameworks and reliability engineering practices
Experience supporting high-scale or high-throughput systems
Familiarity with AI/ML or data-intensive platforms
Experience with containerization and orchestration (Docker, Kubernetes)

Responsibilities

Lead and grow a high-performing engineering team focused on platform operations and reliability
Define and drive operational excellence practices across the Flow platform (monitoring, alerting, on-call, incident management)
Define and evolve SLOs, SLIs, and reliability standards across platform services
Own and improve incident response processes, including detection, triage, resolution, and postmortems
Drive adoption of best-in-class CI/CD and release practices to improve deployment safety and developer velocity
Partner with product and engineering teams to ensure production readiness and scalability of new services and features
Lead performance testing, capacity planning, and system optimization initiatives
Build and scale observability capabilities (logging, monitoring, tracing) across distributed systems
Enable and support the rollout of AI/ML services by ensuring operational robustness and scalability
Foster a culture of ownership, accountability, and continuous improvement within the team