About the position
We are seeking an ML Infrastructure Engineer to join our team at Runway. In this role, you will be responsible for scaling our infrastructure and tooling for the development, testing, and deployment of our machine learning products. The ideal candidate has experience provisioning large compute clusters for machine learning workflows and supporting teams in creating best practices for reliability and scalability, and thrives in a fast-paced, high-ownership environment. You will have the opportunity to manage compute clusters, create tooling and infrastructure, and build automation and CI/CD pipelines for developing and deploying new machine learning models.
Responsibilities
- Manage large compute clusters for ML training, inference, and development
- Create tooling and infrastructure that abstract compute and storage in ML development workflows
- Build automation and CI/CD pipelines for developing and deploying new machine learning models
Requirements
- 3+ years of experience in a DevOps or Infrastructure Engineer role building machine learning infrastructure and working with large GPU clusters
- Knowledge of cloud providers such as AWS, GCP, or Azure, infrastructure-as-code frameworks such as Terraform, and observability tools such as Grafana
- Interest and experience supporting engineering teams in creating robust processes for automation, reliability, and instrumentation
- Strong communication, collaboration, and documentation skills
Benefits
- Competitive salary range of $150,000-$200,000
- Opportunity to work with a small and growing team of creative and entrepreneurial individuals
- Access to cutting-edge technology and tools for video and content creation
- Chance to work on machine learning-based products and infrastructure
- Supportive and inclusive work environment that values diversity
- Opportunity for professional growth and development
- Emphasis on automation, reliability, and instrumentation in engineering processes
- Collaboration with top-tier investors and industry professionals
- Opportunity to contribute to pushing the boundaries of creativity and storytelling
- Equal opportunity for success regardless of race, gender identity, sexual orientation, religion, origin, ability, or age