AI/ML Systems Engineer (2026 New College Graduate)

GlobalFoundries•Richardson, TX

2d•$72,000 - $124,800•Onsite

About The Position

We are seeking an early-career AI/ML Systems Engineer to deepen our workload analysis and performance modeling capabilities. You will take ownership of workload characterization and hardware mapping studies, contribute to cross-functional architecture discussions, and help define the team's methodology for estimating and validating performance KPIs. This is a high-impact role for someone who wants to sit at the intersection of machine learning, computer architecture, and systems optimization.

Essential Responsibilities:

You will independently study AI/ML workloads across the inference and training stack (including CNNs, transformers, recurrent architectures, and emerging model classes) and build quantitative models of their behavior on real and projected hardware. This includes identifying compute, memory bandwidth, and power bottlenecks using techniques like roofline analysis, operational intensity profiling, and bottleneck decomposition.

You will work closely with SoC and IP architecture teams to map workload demands to hardware capabilities and feed your findings into discussions around design tradeoffs, ISA extensions, memory subsystem sizing, and on-chip vs. off-chip bandwidth allocation. On the software side, you will engage with compiler and runtime teams to identify where kernel optimization, scheduling, or memory layout changes can close performance gaps.

A significant part of the role involves estimation and modeling before silicon is available: building spreadsheet or code-based models that project achievable throughput, latency, and efficiency for candidate architectures, then validating those models against silicon or simulation data.

You will communicate findings through written reports, presentations, and design review participation. Clarity and rigor in your technical communication are as important as the analysis itself.

Other Responsibilities: Perform all activities in a safe and responsible manner and support all Environmental, Health, Safety & Security requirements and programs.

Requirements

Graduating with a Bachelor’s or Master’s degree in Electrical Engineering, Computer Engineering, Computer Science, or a related field from an accredited degree program.
0-2 years of relevant industry experience in systems engineering, hardware architecture, ML infrastructure, or performance engineering.
An overall GPA of at least 3.0 and good academic standing.
Fluency in written and verbal English.

Nice To Haves

Exposure to AI compiler toolchains is preferred.
Familiarity with MLIR, IREE, TVM, or similar compilation infrastructure — even at a conceptual level — will help you engage productively with compiler and runtime engineers and understand how graph-level and kernel-level transformations affect the workloads you analyze.
Experience defining or refining performance KPI frameworks.
Prior work on edge or mobile SoC workload characterization.
Hands-on experimentation with MLIR or IREE compilation pipelines.
Knowledge of RISC-V architecture and Vector/Matrix extensions is a strong plus.
Prior related internship or co-op experience.
Demonstrated prior leadership experience in the workplace, school projects, competitions, etc.
Project management skills, i.e. the ability to innovate and execute solutions that matter; the ability to navigate ambiguity.
Strong written and verbal communication skills.
Strong planning & organizational skills.
Strong mathematical reasoning is a firm requirement. You should be able to construct and manipulate analytical performance models from first principles, deriving bandwidth utilization bounds, reasoning about arithmetic intensity across operator types, estimating latency under queuing or pipeline constraints, and interpreting numerical precision effects on model accuracy and hardware efficiency.
The ability to move fluidly between mathematical formulation and engineering intuition is central to doing this job well.
You are comfortable writing analysis code in Python and can build clean, reproducible models.
You communicate technical results well in both written and spoken form, and you can hold your own in architecture discussions with specialists on either the hardware or software side.
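
As a rough illustration of the first-principles reasoning about arithmetic intensity mentioned above, the following minimal Python sketch compares FLOPs per byte for two operator types at different numerical precisions. The function names, the matrix shapes, and the idealized assumption that each operand crosses the memory interface exactly once are illustrative only and are not taken from the team's actual tooling.

```python
def matmul_intensity(m: int, n: int, k: int, bytes_per_elem: int) -> float:
    """FLOPs per byte for C[m, n] = A[m, k] @ B[k, n].

    Assumption: A and B are each read once and C is written once
    (perfect on-chip reuse), so this is an upper bound on intensity.
    """
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved


def elementwise_add_intensity(bytes_per_elem: int) -> float:
    """FLOPs per byte for z = x + y: one add per element, two reads, one write."""
    return 1 / (3 * bytes_per_elem)


if __name__ == "__main__":
    # Hypothetical square GEMM; precisions listed by element size in bytes.
    for name, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
        gemm = matmul_intensity(1024, 1024, 1024, nbytes)
        add = elementwise_add_intensity(nbytes)
        print(f"{name}: matmul {gemm:.1f} FLOP/B, elementwise add {add:.3f} FLOP/B")
```

Under these assumptions, halving the element size doubles the intensity of both operators, but the elementwise add stays far below the matmul, which is why the two typically land on opposite sides of a hardware balance point.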

Responsibilities

Independently study AI/ML workloads across the inference and training stack — including CNNs, transformers, recurrent architectures, and emerging model classes — and build quantitative models of their behavior on real and projected hardware.
Identify compute, memory bandwidth, and power bottlenecks using techniques like roofline analysis, operational intensity profiling, and bottleneck decomposition.
Work closely with SoC and IP architecture teams to map workload demands to hardware capabilities and feed your findings into discussions around design tradeoffs, ISA extensions, memory subsystem sizing, and on-chip vs. off-chip bandwidth allocation.
Engage with compiler and runtime teams to identify where kernel optimization, scheduling, or memory layout changes can close performance gaps.
Build spreadsheet or code-based models that project achievable throughput, latency, and efficiency for candidate architectures, then validate those models against silicon or simulation data.
Communicate findings through written reports, presentations, and design review participation.
Perform all activities in a safe and responsible manner and support all Environmental, Health, Safety & Security requirements and programs.
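
To give a sense of the kind of code-based model described in the responsibilities above, here is a minimal roofline sketch in Python. It bounds attainable throughput by the lesser of peak compute and memory bandwidth times operational intensity, then projects a best-case latency. The hardware figures (4 TFLOP/s peak, 50 GB/s bandwidth) and all function names are hypothetical and chosen only for illustration; a real model would be validated against silicon or simulation data.

```python
def attainable_flops(peak_flops: float, mem_bw: float, intensity: float) -> float:
    """Roofline bound: min(peak compute, bandwidth * operational intensity)."""
    return min(peak_flops, mem_bw * intensity)


def ridge_point(peak_flops: float, mem_bw: float) -> float:
    """Operational intensity (FLOP/byte) at which the two roofs meet."""
    return peak_flops / mem_bw


def classify(peak_flops: float, mem_bw: float, intensity: float) -> str:
    """Label a workload by which roof limits it."""
    if intensity < ridge_point(peak_flops, mem_bw):
        return "memory-bound"
    return "compute-bound"


def projected_latency_s(total_flops: float, peak_flops: float,
                        mem_bw: float, intensity: float) -> float:
    """Best-case execution time if the workload runs at the roofline bound."""
    return total_flops / attainable_flops(peak_flops, mem_bw, intensity)


if __name__ == "__main__":
    PEAK = 4e12  # hypothetical peak compute, FLOP/s
    BW = 50e9    # hypothetical off-chip bandwidth, bytes/s
    for intensity in (10.0, 200.0):  # FLOP/byte, illustrative workloads
        bound = attainable_flops(PEAK, BW, intensity)
        label = classify(PEAK, BW, intensity)
        print(f"intensity {intensity}: {bound / 1e9:.0f} GFLOP/s ({label})")
```

The bound is optimistic by construction: it ignores latency, scheduling overhead, and on-chip bandwidth limits, so measured throughput should sit at or below it.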