Spun out of MIT CSAIL, we build general-purpose AI systems that run efficiently across deployment targets, from data center accelerators to on-device hardware, ensuring low latency, minimal memory usage, privacy, and reliability. We partner with enterprises across consumer electronics, automotive, life sciences, and financial services. We are scaling rapidly and need exceptional people to help us get there.

The Opportunity
This is a rare chance to sit at the intersection of frontier vision-language models and real-world deployment. You'll own applied post-training work for VLMs end-to-end for some of the world's largest enterprises, while still contributing directly to Liquid's core multimodal model development.

Unlike most roles that force a trade-off between customer impact and foundational work, this role gives you both: deep ownership over how vision-language models are adapted, evaluated, and shipped, and a direct line into the evolution of Liquid's multimodal post-training stack.

If you care about visual understanding, data quality, evaluation, and making VLMs actually work in production, this is a chance to shape how applied multimodal AI is done at a foundation model company.

What We're Looking For
We need someone who:
Takes ownership: Owns VLM post-training projects end-to-end, from customer requirements through delivery and evaluation.
Thinks end-to-end: Can reason across visual data curation, training, alignment, and evaluation as a single system.
Is pragmatic: Optimizes for model quality and customer outcomes over publications or theory.
Communicates clearly: Can translate between customer needs and internal technical teams, and push back when needed.
Job Type
Full-time
Career Level
Mid Level
Education Level
No Education Listed