At Modular, we’re on a mission to revolutionize AI infrastructure by systematically rebuilding the AI software stack from the ground up. Our team, made up of industry leaders and experts, is building cutting-edge, modular infrastructure that simplifies AI development and deployment. By rethinking the complexities of AI systems, we’re empowering everyone to unlock AI’s full potential and tackle some of the world’s most pressing challenges. If you’re passionate about shaping the future of AI and creating tools that make a real difference in people’s lives, we want you on our team. You can read about our culture and careers to understand how we work and what we value.

About the role:
ML developers today face significant friction in taking trained models into deployment. They work in a highly fragmented space, with incomplete, patchwork solutions that require significant performance tuning and non-generalizable, model-specific enhancements. At Modular, we are building the next-generation AI platform (MAX) that will radically improve the way developers build and deploy AI models. We’re continuously working to improve the performance and scalability of MAX by extending existing features and adding new ones for users to try.

The Serve Optimizations team works cross-functionally across the entire Modular tech stack to implement cutting-edge optimizations and research for auto-regressive text generation, image generation, and beyond. Think Speculative Decoding, LoRA, Quantization, Chunked Prefill, Distributed Inference, and more.

LOCATION: Candidates based in the US or Canada are welcome to apply. You can work in our office in Los Altos, CA or remotely from home. Onboarding for new hires is conducted in person in our Los Altos, CA office.
Job Type
Full-time
Career Level
Mid Level
Education Level
No Education Listed
Number of Employees
51-100 employees