About The Position

Autodesk is transforming the Architecture, Engineering, and Construction (AEC) industry by embedding advanced AI and foundation models into cloud-native platforms such as AutoCAD, Revit, Construction Cloud, and Forma. As a Senior Principal Machine Learning Engineer, you will act as a technical leader and delivery owner for complex, high-impact ML initiatives spanning foundation models, reinforcement learning, data systems, and large-scale ML platforms. You will operate at the intersection of applied research, engineering, and product—setting technical direction while remaining hands-on in the areas of highest complexity and risk. This role is designed for a senior ML tech lead with a proven track record of owning and delivering ML systems at scale, including training and operating models in large, distributed environments.

Requirements

  • Master’s or PhD in an AI/ML-related field such as Computer Science, Mathematics, Statistics, Physics, or Computational Linguistics
  • 10+ years of experience in machine learning, AI, or related fields, with a proven track record of technical leadership and hands-on implementation
  • Demonstrated experience mentoring engineers and leading technical projects in cross-functional environments
  • Proven history of leading the delivery of large-scale ML systems from conception to production
  • Expert-level understanding of deep learning architectures (Transformers, Diffusion models) and modern frameworks (PyTorch required)
  • Hands-on experience with distributed training frameworks and techniques (e.g., PyTorch Distributed, Ray, DeepSpeed, Megatron, CUDA optimization) in HPC or cloud environments (AWS/Azure)
  • Strong proficiency in Python, with an emphasis on performance profiling, debugging, and writing robust, maintainable production code
  • Excellent ability to translate complex technical concepts into clear insights for executive leadership and cross-functional partners

Nice To Haves

  • Experience with large foundation model training in distributed compute environments
  • Experience designing data pipelines for multimodal datasets at the terabyte/petabyte scale (using Spark, Iceberg, etc.)
  • Experience constructing internal developer platforms for ML, utilizing tools like Kubernetes, Slurm, or Metaflow
  • A portfolio demonstrating the successful translation of academic research papers into tangible product features
  • Background in AEC, computational geometry, or experience working with 3D data representations (BIM, CAD, meshes, point clouds)

Responsibilities

  • Technical Strategy & Leadership: Define the long-term technical vision for Generative AI and Foundation Model infrastructure within the AEC Solutions team. Influence architectural decisions across the broader organization.
  • End-to-End Delivery: Lead the design, development, and delivery of complex ML systems. Own the full lifecycle from model architecture selection and data strategy to distributed training and production deployment.
  • Foundation Model Engineering: Drive the development of large-scale training pipelines. Collaborate with Research Scientists to translate experimental ideas (custom architectures, novel loss functions) into scalable, performant code.
  • Scalability & Infrastructure: Architect solutions for distributed training (e.g., FSDP, Megatron-LM, DeepSpeed) on massive compute clusters. Identify and resolve bottlenecks in data processing and model parallelism to maximize training throughput.
  • Mentorship & Influence: Mentor Principal and Senior engineers, fostering a culture of technical ownership, rigorous experimentation, and best practices. Act as a technical partner to Product Management and Engineering leadership.
  • Cross-Functional Collaboration: Partner effectively with Data Engineering, Platform, and Research teams to integrate large-scale multimodal AEC data (3D geometry, images, text) into model development workflows.
  • Operational Excellence: Establish standards for model evaluation, versioning, monitoring, and MLOps best practices to ensure reproducibility and reliability in a high-stakes production environment.