Member of Technical Staff - Microarchitect / RTL Design

Architect • Palo Alto, CA

About The Position

Architect is a frontier AI lab for chip design. We build AI models and tools for on-demand custom ASICs at scale. Our goal is to co-design custom ASICs alongside evolving ML workloads and to enable a new era of domain-specific chips that unlock capabilities impossible with current hardware paradigms. Born out of Stanford Research, we blend AI with silicon, and our founding team comes from Anthropic, Google DeepMind, Meta SuperIntelligence, xAI, Apple, and Intel.

Requirements

Bachelor's, Master's, or PhD in Electrical Engineering, Computer Engineering, or a closely related field.
5+ years (10+ preferred) of RTL design experience, including at least one advanced-node tapeout.
RTL design experience on specialized HW accelerators, such as SoCs/IPs integrating XPUs (NPU, GPU, AR/VR) or AI/ML accelerators.
Ideally, prior work on Apple Neural Engine, Qualcomm Hexagon NPU / AI Engine, Google Edge TPU, AMD XDNA, Samsung NPU, MediaTek APU, or NVIDIA DLA blocks, or on accelerators at Groq, Cerebras, MatX, d-Matrix, or similar companies.
Ability to write clear, synthesizable, lint-clean RTL with strong design habits such as parameterization, modularity, reuse, and configurability.
Hands-on experience with block-specific compute datapaths and data movement, such as MAC arrays, vector units, accumulators, on-chip SRAM controllers and arbiters, DMA engines, and scratchpad memory management.
Solid grasp of synthesis, timing constraints, clock domain crossings, reset strategies, AMBA protocols (AXI, AHB, APB), and power management techniques.
Strong skills in Python for design automation, regression infrastructure, and tooling.
Experience taking a block from RTL through synthesis and working with PD teams on timing/area/power closure.
Ability to lead RTL design efforts and grow into a team lead over time.

Nice To Haves

Low-power design techniques: clock gating, power gating, multi-voltage domains, UPF.
FPGA prototyping experience (ideally Xilinx Vivado/Vitis).
Familiarity with SIMD/VLIW execution pipelines or instruction-driven datapath design.
Experience writing SVA assertions and functional coverage for design-side verification.
Prior experience building and delivering IP in your block of expertise, such as DMA controllers, memory subsystems, interconnects, or similar SoC infrastructure blocks.
Track record of research and development in energy-efficient, high-performance HW accelerators in your block of expertise.

Responsibilities

Define, drive, and revise the block-level micro-arch specification for one of the fundamental HW accelerator blocks.
Own the AI-driven RTL design flow end-to-end on the front end: from code generation through incorporating feedback from the lint, CDC, synthesis, and timing-closure stages to close the design loop.
Work directly with the principal architect to refine microarchitectural specs, resolve implementation trade-offs, and feed area/timing/power realities back into the architecture and internal AI systems.
Define and maintain interface specifications (e.g., AXI, AXI-Stream, or custom-built) for block-level and subsystem-level (SS) integration.
Build and maintain RTL infrastructure for our in-house AI-driven flow: design automation scripts, regression flows, lint/CDC waivers, and integration collateral.
Support DV bring-up with reference models, assertions, test plans, and architectural documentation for verification closure.
Guide our SW and ML experts in revising and improving our in-house AI flow, drawing on your own design experience.
Support FPGA prototyping on Xilinx for early functional validation.