Applied Intuition is seeking a software engineer with deep experience optimizing ML models and deploying them in production-grade embedded runtime environments. The role involves working across the entire ML framework stack, including PyTorch, JAX, ONNX, TensorRT, CUDA, XLA, and Triton, and driving ML performance optimization for ADAS/AD stacks targeting deployment on embedded compute platforms.

Responsibilities include:
- Developing compute usage strategies for efficient model inference
- Applying model pruning and quantization for memory-constrained platforms
- Collaborating with ML engineers and software developers
- Establishing methodologies for profiling model performance on embedded compute platforms to identify bottlenecks
Job Type: Full-time
Career Level: Mid Level
Number of Employees: 501-1,000