  • Expand our in-house analytical GPU framework to make it amenable to workload profiling and validate the updates with Gainsight
  • Identify opportunities for eDRAM (including refresh-free operation) according to data types (activation, KV cache, weights) and a set of relevant workloads for multi-GPU (single server) inference
  • Identify and contrast the opportunities for LLM prefill and decode separately, assuming long context lengths
  • Benchmark various eDRAM options (1T1C BEOL, 2TGC hybrid, 2TGC BEOL) vs. the SRAM baseline with respect to workload-level inference energy and latency at iso-area and iso-capacity
  • Determine how to modify the GPU architecture to make the most of eDRAM
  • Determine which AI workloads would benefit the most from eDRAM
  • Determine the architectural and algorithmic options that maximize refresh-free operation
  • Determine how much benefit eDRAM would bring to AI training
  • Ph.D. Student in Electrical Engineering or Computer Science
  • Experience with GPU architecture and simulations
  • Experience with GPU latency and energy evaluations
  • Experience with embedded DRAM options
  • Experience with generative AI workloads
  • Experience with AI workload profiling
  • Memory array design
  • Understanding of 3D integration schemes
© 2024 Teal Labs, Inc