This role will be based in Sunnyvale or Mountain View, CA. At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team.

Join LinkedIn’s Serving Foundations team and work on one of the most critical layers of our AI platform—powering large-scale model inference across all major AI use cases. This team sits at the center of LinkedIn’s AI stack and is responsible for making our models faster, more efficient, and more scalable at production scale.

This is a deeply technical, systems-focused position at the intersection of machine learning, compilers, and hardware. You will work across the full stack—from model graphs and optimization techniques to runtime systems, kernels, and GPU execution—to push the limits of performance and efficiency.

You will lead efforts to optimize large-scale inference systems serving billions of requests, driving improvements in latency, throughput, and cost. This includes advancing GPU utilization, designing custom kernel and operator optimizations, improving model efficiency through quantization and compression, and shaping how models are compiled and executed in production environments.

As a Sr. Staff engineer, you will operate with a high degree of autonomy and influence, identifying bottlenecks across the system and driving end-to-end solutions across model, runtime, and infrastructure layers. Your work will directly impact how AI systems perform at LinkedIn scale and will help define the future of our AI serving platform.
Job Type
Full-time
Career Level
Mid Level