We are seeking experienced Platform Engineers with expertise in MLOps and distributed systems, particularly Kubernetes, along with a strong background in managing multi-GPU, multi-node deep learning job and inference scheduling. Candidates should be proficient in Linux (Ubuntu) systems, able to write intricate shell scripts, proficient with configuration management tools, and have a sufficient understanding of deep learning workflows.