We are hiring a Founding ML Infrastructure Engineer to own the end-to-end deployment, optimization, and operation of our suite of models in production. This is a core founding role focused on building and operating production-grade LLM systems.

You will apply deep knowledge of model internals to deploy, optimize, and run modern LLMs at scale, owning performance across latency, throughput, and reliability. You will design and operate the full ML serving stack, from model artifacts to GPU execution, and work closely with the Product and ML teams to ensure our models can support high QPS, strict SLAs, and production correctness.

This role is ideal for someone who deeply understands how LLMs work internally but chooses to specialize in making them fast, stable, and production-ready.

About Realm Labs
Realm Labs is an AI trust and security startup. We help enterprises detect, debug, and prevent AI misbehavior in production. We are backed by top VCs and serve some of the most iconic global enterprises.
Job Type
Full-time
Career Level
Mid Level
Education Level
No Education Listed