About The Position

We are looking for experienced engineers to help build and scale next-generation AI infrastructure using PyTorch, one of the world’s most widely used deep learning frameworks. This role sits at the intersection of machine learning systems, compilers, and high-performance computing, enabling researchers and product teams to train and deploy large-scale models efficiently. You will work on core components of the PyTorch ecosystem, including model execution, distributed training, performance optimization, and developer experience.

Requirements

  • PhD or MSc degree in Computer Science, Applied Math, Physics, or related science or engineering field (or equivalent experience)
  • 8+ years of software development experience
  • Strong programming skills in C++ and Python
  • Deep understanding of deep learning frameworks, preferably PyTorch
  • Experience with GPU programming (CUDA or similar) and performance optimization

Nice To Haves

  • Contributions to PyTorch core or ecosystem libraries
  • Experience with NVIDIA AI stack (TensorRT, Triton Inference Server, cuBLAS, cuDNN, NCCL)
  • Familiarity with ML compilers (TorchInductor, Triton, XLA, TVM)
  • Experience optimizing LLMs or large-scale recommendation / vision models
  • Background working closely with hardware-aware software optimization

Responsibilities

  • Design and build core PyTorch capabilities across runtime, autograd, distributed training, and model execution
  • Optimize performance across GPU/accelerator backends (CUDA, Triton, etc.)
  • Contribute to or lead development of large-scale ML systems and infrastructure
  • Improve model training efficiency, scalability, and reliability across multi-node environments
  • Work on compilers / graph transformations / kernel optimizations to accelerate deep learning workloads
  • Partner with researchers and applied teams to translate cutting-edge models into production systems
  • Drive open-source contributions and collaborate with the broader PyTorch community
  • Influence roadmap and architecture for next-gen AI platforms
  • Work at the forefront of AI and accelerated computing
  • Make a direct impact on how PyTorch runs on the world’s most advanced GPU platforms
  • Collaborate across hardware, systems software, and AI research to push performance boundaries and enable breakthroughs in generative AI, autonomous systems, and high-performance computing

Benefits

  • Competitive salaries
  • Generous benefits package
  • Equity


What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Number of Employees: 5,001-10,000
