SambaNova Systems - posted 10 months ago
$200,000 - $250,000/Yr
Full-time • Senior
Palo Alto, CA

The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, and drive efficiency and innovation, fundamentally transforming their businesses and operations at scale. SambaNova Suite™ is the first full-stack generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once models are adapted with customer data, customers retain model ownership in perpetuity, so they can turn generative AI into one of their most valuable assets.

We’re seeking a Lead Architect, Runtime to join our talented Runtime team, a group of engineers with a proven track record of building software that directly powers advanced AI workloads and scientific computing. As a key technical leader, you will design and architect a high-performance, distributed, and scalable software runtime that supports our broad array of dataflow applications, including machine learning training and inference, data processing pipelines (ETL), and HPC applications. You will have the opportunity to define and deliver the architecture of our entire runtime stack, driving everything from OS-level integration to performance profiling, networking, and optimization, while working closely with hardware teams to design the most efficient systems.

Responsibilities

  • Lead the design, development, and performance optimization of the software runtime stack, ensuring it meets the high-performance and scalability requirements of ML, AI, and HPC applications.
  • Architect embedded software infrastructure to enable smooth integration of high-level applications with the underlying hardware, including OS interface/integration, partitioned workload orchestration, fault management, and inter-node communication.
  • Oversee and guide the low-level integration between software and hardware components, ensuring efficient chipset initialization, monitoring, and fault management.
  • Drive the technical direction for the Runtime Engineering team, ensuring the design and implementation of software that delivers performance and scales efficiently with our next-generation AI hardware and platforms based on our Reconfigurable Dataflow Architecture.
  • Lead the design and development of tools and performance profilers, empowering customers to configure, deploy, and optimize their workloads on SambaNova’s DataScale systems.
  • Inspire and guide the team to continuously improve development processes, coding standards, and collaboration practices. Foster a culture of excellence, accountability, and technical growth.
  • Collaborate with hardware, software, and product teams to define requirements and ensure seamless integration between hardware and system software components.

Qualifications

  • Proven experience building, testing, and tuning software for distributed, high-performance systems.
  • In-depth knowledge of operating systems and runtime stacks.
  • Hands-on experience with Real-Time Operating Systems (RTOS) and system-level software that directly interfaces with hardware.
  • Expertise in designing and optimizing systems that handle massively parallel workloads, including machine learning training and inference tasks that involve billions of operations per second.
  • Deep understanding of hardware-software interaction, including registers, device memory management, and the intricacies of accelerator design.
  • Familiarity with distributed systems architecture, including networking, communication protocols, and the challenges of scaling compute resources efficiently.
  • Hands-on experience with software development tools such as Git, Jenkins, and Jira, with an ability to drive automation and continuous integration efforts.
  • Ability to work at the intersection of hardware and software, designing systems that optimize both performance and reliability.
  • Experience designing or working closely with custom hardware accelerators (ASICs, FPGAs, etc.) and understanding low-level interactions.
  • Familiarity with deploying high-performance systems in distributed, cloud, or data center environments.

Benefits

  • Competitive total rewards package, including base salary, equity, and benefits.
  • 95% premium coverage for employee medical insurance, and 77% premium coverage for dependents.
  • Health Savings Account (HSA) with employer contribution.
  • Dental, Vision, Short- and Long-Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans.
  • Flexible Spending Account (FSA) options, including Health Care, Limited Purpose, and Dependent Care.
  • Well-being benefits including a full subscription to Headspace, Gympass+ membership, One Medical membership, and counseling services.