The AGI (Artificial General Intelligence) Computing Lab is dedicated to solving the complex system-level challenges posed by the growing demands of future AI/ML workloads. Our team is committed to designing and developing scalable platforms that can effectively handle the computational and memory requirements of these workloads while minimizing energy consumption and maximizing performance. To achieve this goal, we collaborate closely with both hardware and software engineers to identify and address the unique challenges posed by AI/ML workloads and to explore new computing abstractions that can provide a better balance between the hardware and software components of our systems.
Additionally, we continuously conduct research and development in emerging technologies and trends across memory, computing, interconnect, and AI/ML, ensuring that our platforms are always equipped to handle the most demanding workloads of the future. By working together as a dedicated and passionate team, we aim to revolutionize the way AI/ML applications are deployed and executed, ultimately contributing to the advancement of AGI in an affordable and sustainable manner. Join us in our passion to shape the future of computing!
This role is offered by the STG group within the AGI Lab as part of DSRA. We are a systems research and engineering team working at the intersection of large language models, accelerator hardware, and high-performance software. Our mission is to design, prototype, and optimize next-generation AI systems through tight hardware–software co-design. Our team works hands-on with cutting-edge accelerator hardware, advanced memory systems, and large-scale distributed AI infrastructure. We develop and optimize the software stack required to maximize performance, efficiency, and scalability for modern and emerging LLM workloads.
We are seeking a Senior LLM Systems Performance Engineer to build representative AI environments, characterize emerging workloads, and drive performance analysis for next-generation AI platforms. In this role, you will set up and operate realistic LLM serving and agentic AI environments, collect workload traces and performance data, and develop methodologies to characterize workload behavior. You will analyze system bottlenecks across compute, memory, communication, and scheduling resources, and evaluate how emerging workloads interact with AI accelerator architectures and system infrastructure.
The ideal candidate combines hands-on experience building large-scale AI systems with strong performance engineering skills and a solid understanding of AI accelerator architecture. You should be comfortable working across the full stack, from application frameworks and serving systems to runtime software, networking, memory systems, and accelerator hardware. You will work closely with hardware architects, systems engineers, and software researchers to understand the performance implications of emerging workloads such as agentic AI, long-context reasoning, disaggregated inference, and Mixture-of-Experts models. Your analysis will help shape future hardware–software co-design decisions and guide the development of next-generation AI infrastructure.
Job Type: Full-time
Career Level: Senior