System Performance & AI Architect

Majestic Labs, Los Altos, CA

About The Position

As a System Architect, you will be responsible for the end-to-end performance simulation of our next-generation AI and graph-computing platforms. You will perform the quantitative analysis and architectural pathfinding that define how our systems handle the world’s most complex data-centric workloads, from trillion-parameter LLMs and Mixture-of-Experts (MoE) models to large-scale, irregular Graph Neural Networks (GNNs).

Requirements

  • Education: PhD in Electrical Engineering, Computer Science, or a related field with a focus on Computer Architecture or High-Performance Systems.
  • Experience: 10+ years of experience in performance modeling and system architecture, with a proven track record at major semiconductor or hyper-scale AI organizations.
  • Expertise in Data-Centric Computing: Deep understanding of Instruction Set Architectures (ISA), Cache Coherence, and Memory Consistency.
  • Expertise in modeling Interconnect Topologies and flow control for distributed AI training and inference.
  • Advanced proficiency in Modern C++ and Python for building sophisticated system-level simulators.
  • Profiling Mastery: Hands-on experience with performance analysis tools (e.g., Nsight, ROCm, VTune) and with developing custom trace-injection tools to correlate silicon behavior with simulation models.
  • Workload Mastery: Demonstrated ability to characterize and optimize for irregular data-flow patterns common in GNNs, LLMs, and Recommendation Systems.
  • Strategic Influence: Experience presenting data-driven architectural recommendations to executive leadership and strategic partners based on a blend of simulation and empirical data.
  • Customer Engagement: Proven ability to translate customer-facing performance requirements into actionable hardware specifications.
  • Technical Mentorship: A history of elevating the technical bar for engineering teams and championing modern, automated engineering workflows.

Responsibilities

  • System-Level Performance Projection: Develop and execute high-fidelity, system-level performance models that simulate the interaction between compute clusters, Network-on-Chip (NoC), and advanced memory hierarchies (HBM4, CXL).
  • Empirical Profiling & Characterization: Drive deep-dive performance profiling of existing hardware architectures (GPUs, NPUs, and SoCs). Use hardware counters, trace-based analysis, and telemetry to identify real-world bottlenecks in current silicon that inform future architectural iterations.
  • Workload-Architecture Co-Design: Profile frontier AI models and graph analytics to identify deep-system bottlenecks. Translate high-level algorithmic behaviors (e.g., KV cache growth, sparse matrix traversals) into hardware architectural requirements.
  • Memory Subsystem Innovation: Define the strategy for managing the "Memory Wall," optimizing for bandwidth, latency, and power across complex hierarchies and disaggregated memory pools.
  • Architectural Pathfinding: Evaluate and influence the adoption of emerging system technologies. Conduct trade-off analyses that determine the multi-year roadmap for system topology and scalability.
  • AI-Augmented Engineering: Champion an "AI-first" approach to architecture, utilizing machine learning and automation to accelerate simulation throughput and explore massive design spaces.
  • Cross-Functional Technical Leadership: Serve as a primary bridge between Software/Compiler teams and Hardware Implementation, ensuring architectural specifications meet real-world production constraints.

What This Job Offers

Job Type: Full-time

Career Level: Mid Level

Education Level: Ph.D. or professional degree

© 2024 Teal Labs, Inc