You will write, evaluate, and profile specialized compute kernels that run on a custom AI accelerator. This is the critical interface between high-level ML workloads and silicon — your code directly determines how effectively the hardware performs. You'll work closely with the architecture and compiler teams to define the kernel programming model, implement core tensor operations, and drive the performance profiling workflow that validates silicon design decisions.
Stand Out From the Crowd
Upload your resume and get instant feedback on how well it matches this job.
Job Type
Full-time
Career Level
Senior
Education Level
No Education Listed