In this role, you will lead the definition, development, and delivery of Nebius Token Factory’s inference capabilities, focusing on highly scalable, production-grade machine learning systems. You will shape the direction of our inference platform, driving product decisions that balance performance, reliability, and real-world customer needs. This includes working closely with engineering and research teams to design and optimize real-time and batch inference workflows, supporting customer PoCs, and translating technical challenges into clear product requirements.

You will work directly with customers and internal stakeholders to understand ML workflows at scale, identify bottlenecks, and define features that improve latency, throughput, orchestration, and deployment efficiency. You will also drive product adoption by delivering intuitive tools and robust infrastructure that solve complex inference problems across diverse use cases.

This role requires a strong technical foundation in ML systems and a product mindset oriented toward execution, clarity, and long-term scalability. You are welcome to work remotely from the US.