Cartesia · Posted 3 months ago
HQ - San Francisco, CA
51-100 employees

Cartesia is on a mission to build the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. We are pioneering model architectures that will enable continuous processing and reasoning over extensive streams of audio, video, and text. Our founding team of PhDs from the Stanford AI Lab invented State Space Models (SSMs) to train efficient, large-scale foundation models. We are looking for an Inference Engineer to help advance our mission of building real-time multimodal intelligence.

Responsibilities:
  • Design and build a low-latency, scalable, and reliable model inference and serving stack for our cutting-edge foundation models using Transformers, SSMs, and hybrid models.
  • Work closely with our research team and product engineers to serve our suite of products in a fast, cost-effective, and reliable manner.
  • Design and build robust inference infrastructure and monitoring for our products.
  • Have significant autonomy to shape our products and directly impact how cutting-edge AI is applied across various devices and applications.
Requirements:
  • Strong engineering skills; comfortable navigating complex codebases and monorepos.
  • An eye for craft and writing clean and maintainable code.
  • Experience building large-scale distributed systems with high demands on performance, reliability, and observability.
  • Technical leadership with the ability to execute and deliver zero-to-one results amidst ambiguity.
  • Experience designing best practices and processes for monitoring and scaling large-scale production systems.
  • Background in or experience working on inference pipelines with machine learning and generative models.
  • Experience working with CUDA, Triton, or similar.
Benefits:
  • Lunch, dinner, and snacks at the office.
  • Fully covered medical, dental, and vision insurance for employees.
  • 401(k).
  • Relocation and immigration support.
  • Your own personal Yoshi.