We are looking for an ambitious and talented individual who is keen to apply their skills to real-world AI infrastructure problems. In this role, you will contribute to building a dynamic resource allocation system designed to improve efficiency and productivity. This project is key to eliminating resource contention and optimizing our cloud infrastructure costs. The goal is to ensure that development VMs are provisioned and consumed as needed, based on the lifecycle defined by the user.

Beyond system efficiency gains, this project will increase user productivity by eliminating resource access bottlenecks: engineers will be able to provision machines instantly for every task, which streamlines workflows and accelerates project completion.

About the Work
Develop a system to provide users with GPU VMs for their development environment.
Create a dynamic VM allocation mechanism integrated into a shared Google Kubernetes Engine (GKE) resource pool.
Integrate with our in-house ML Scheduler for VM provisioning and lifecycle management.
Career Level
Intern