webAI, posted 3 months ago
Full-time • Senior
Austin, TX
1-10 employees

webAI is pioneering the future of artificial intelligence by establishing the first distributed AI infrastructure dedicated to personalized AI. We recognize the evolving demands of a data-driven society for scalability and flexibility, and we firmly believe that the future of AI lies in distributed processing at the edge, bringing computation closer to the source of data generation. Our mission is to build a future where a company's valuable data and intellectual property remain entirely private, enabling the deployment of large-scale AI models directly on standard consumer hardware without compromising the information embedded within those models. We are developing an end-to-end platform that is secure, scalable, and fully under the control of our users, empowering enterprises with AI that understands their unique business. We are a team driven by truth, ownership, tenacity, and humility, and we seek individuals who resonate with these core values and are passionate about shaping the next generation of AI.

Responsibilities:
  • Implement and optimize advanced ML architectures (Transformers, Mixture of Experts, Diffusion models) in C++, with a focus on performance and memory efficiency.
  • Develop and fine-tune custom Metal kernels for performance-critical inference operations.
  • Apply advanced model quantization techniques (low-bit, mixed-precision) to accelerate performance while minimizing footprint.
  • Profile, benchmark, and tune inference on Apple Silicon (M-series, A-series), identifying and eliminating bottlenecks.
  • Collaborate on API design and build Python bindings for C++ libraries.
  • Contribute to robust testing frameworks to ensure reliability and performance.

Qualifications:
  • Bachelor’s degree in CS, EE, or a related field, or equivalent experience.
  • 4+ years of professional experience in C++ systems programming.
  • Strong understanding of computer architecture, data structures, and algorithms.
  • Demonstrated experience with performance profiling and low-level optimization.
  • Familiarity with deep learning concepts and architectures (Transformers, Diffusion models, Mixture of Experts).
  • Deep expertise with Apple’s MLX framework.
  • Demonstrable experience writing and optimizing custom Metal kernels.
  • Experience with model quantization techniques and their performance implications.
  • Familiarity with the iOS/macOS development ecosystem and build systems (CMake).
  • Experience creating Python bindings for C++ libraries.

Benefits:
  • Competitive salary and performance-based incentives.
  • Comprehensive health, dental, and vision benefits package.
  • 401(k) match (US-based only)
  • $200/month Health and Wellness Stipend
  • $400/year Continuing Education Credit
  • $500/year Function Health subscription (US-based only)
  • Free parking for in-office employees
  • Unlimited Approved PTO
  • Parental Leave for Eligible Employees
  • Supplemental Life Insurance