As a Software Engineer on the Inference & RL Systems team, you will design and operate the distributed systems that serve our models in production and power large-scale post-training workflows. This role sits at the boundary between model execution and distributed infrastructure. You will work on systems that determine inference latency, throughput, and stability, as well as the reliability of RL and post-training loops. Magic’s long-context models introduce demanding execution constraints: KV-cache scaling, memory pressure under long sequences, batching trade-offs, long-horizon trajectory rollouts, and sustained throughput under real-world workloads. You will own the infrastructure that makes both production inference and large-scale RL iteration fast and reliable.
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed