Intern - Vector Compute Architect

Bolt Graphics • Sunnyvale, CA

About The Position

We are looking for an initiative-taking Vector Compute Architect Intern to join our advanced architecture team working on next-generation AI, GPU, and high-performance computing platforms. This role focuses on defining and optimizing vector compute architectures for graphics and scientific computing workloads. The ideal candidate is pursuing an MS or PhD in Computer Engineering, Electrical Engineering, Computer Science, or a related field, with a strong background in computer architecture, parallel processing, and performance modeling.

Requirements

Pursuing an MS or PhD in Electrical Engineering, Computer Engineering, Computer Science, or a related field.
Proven work experience (preferably published) in workload characterization and profiling, performance modeling, out-of-order data dependency and control, utilization/occupancy optimization, and high-performance architecture design techniques.
Experience with one or more of the following: CPU/GPU/NPU architectures, NoC/interconnect architectures, cache coherency protocols (CHI/ACE/CXL), high-speed interfaces (PCIe, UCIe, Ethernet), memory systems (DDR, LPDDR, HBM, GDDR), and power, performance, and area optimization.
Strong knowledge of RTL development and verification methodologies.
Experience with architecture modeling and performance analysis tools.
Familiarity with firmware/software interaction in complex SoC systems.
Excellent problem-solving, communication, and leadership skills.

Nice To Haves

Strong communication and technical presentation skills.
Self-driven and capable of working in a fast-paced startup or research-oriented environment.
Passion for AI, GPUs, high-performance computing, and advanced semiconductor technologies.

Responsibilities

Define a data-parallel microarchitecture that satisfies ISA constraints.
Drive architecture tradeoff analysis for performance, power, area, bandwidth, latency, and scalability.
Develop and review system architecture specifications, interface definitions, and microarchitecture requirements.
Collaborate with RTL, verification, physical design, firmware, software, and system teams throughout the development cycle.
Lead performance modeling, workload analysis, and bottleneck identification using C/C++/SystemC or similar modeling environments.
Define memory hierarchy, coherency architecture, and cache structures.
Work closely with verification teams to define architectural test plans and validation strategies.
Support silicon bring-up, debug, performance tuning, and post-silicon optimization.
Contribute to long-term technology and product roadmap planning.

Benefits

Hands-on experience working on cutting-edge vector compute architectures.
Opportunity to collaborate with experienced architecture, software, firmware, and silicon design teams.
Exposure to full-chip architecture trade-offs in advanced process technologies.
Real-world experience in next-generation GPU system development.
