Quest Global, posted 3 months ago
CA

Support the silicon team in designing and developing the RTL microarchitecture that integrates charge-domain Compute-in-Memory (CIM) with high-throughput vector compute and custom logic. You will architect and implement compute pipelines, memory controllers, and high-speed IO subsystems (UCIe, Ethernet, UALink), working closely with SoC architects, software teams, and physical design teams to ensure scalable and efficient integration.

  • Define and implement RTL microarchitecture for CIM-aware vector/SIMD compute pipelines, DMA engines, and data movement fabrics.
  • Integrate compute units with 3D-stacked CIM dies via custom NoCs that meet performance and timing constraints.
  • Design RTL for high-speed IO blocks including UCIe, UALink, SerDes, and 800G Ethernet MACs.
  • Collaborate with SoC architects to ensure coherency, cache control, and bus protocols across compute and memory domains.
  • Engage with firmware/software teams to define early validation hooks and develop functional testbenches and bring-up diagnostics.
  • Deliver early microarchitecture models and integration stubs to software/firmware teams for pre-silicon testing.
  • Coordinate with P&R and layout teams to ensure floorplan-aware RTL, proper clocking, and IO placement.
  • Proficiency in lint, clock-domain-crossing (CDC), and reset-domain-crossing (RDC) checks for RTL modules.
  • 7+ years of RTL design and microarchitecture experience in high-performance SoCs, DSPs, or AI accelerators.
  • Expertise in multi-core compute fabric, pipeline design, and tightly coupled memory subsystems.
  • Proven experience designing and verifying IO logic for UCIe, PCIe, SerDes, or chiplet-based architectures.
  • Strong understanding of SoC-level system architecture, memory hierarchy, coherency protocols, and debug features.
  • Familiarity with early bring-up methodology: RTL-based testbench creation, firmware interface modeling, and diagnostics support.
  • Tools: Verilog/SystemVerilog, UVM (preferred), Synopsys/Cadence synthesis and P&R flow.
  • Exposure to compute-in-memory, near-memory compute, or heterogeneous multi-die integration is highly desirable.