Member of Technical Staff - Efficient ML

Moonlake•San Mateo, CA

6d•Onsite

About The Position

Introducing Moonlake, AI for creating world simulations. Scope of Work Training efficiency Dataloaders, fusion, activation remat, gradient checkpointing. FSDP/ZeRO/tensor+pipeline parallel; NCCL tuning. GPU + kernel performance Nsight profiling, Triton/CUDA kernels, fused ops. Flash-attention–style speedups, sequence packing, KV-cache tricks. Inference optimization Low-latency serving, continuous batching, speculative decoding. Quantization (GPTQ/AWQ), distillation, pruning. Infra + reliability SLURM/K8s multi-node jobs, checkpoint hygiene. Determinism, env pinning, GPU failure handling. We are committed to being an on-site, in-person team currently based in San Mateo

Stand Out From the Crowd

Upload your resume and get instant feedback on how well it matches this job.

Upload and Match Resume