ML Solution Architect (Early Talent)

Nebius
$102 - $126
Remote

About The Position

Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure. Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI.

Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.

The Role

We're looking for an ML Solutions Architect (Early Career) to join the team behind Nebius Token Factory's serverless inference and fine-tuning platform for open-source LLMs. Working alongside senior Solutions Architects, you'll take on real technical work – building and testing LLM-based solutions, benchmarking, and inference optimization – and learn how scalable AI applications are built and tuned on our platform, in close collaboration with our backend team.

This is a hands-on learning role with close mentorship from senior SAs. Strong performers will be considered for a full-time Solutions Architect position at the end of the program. This is a paid temporary contract, open to students and recent graduates. You're welcome to work remotely from any timezone.

Requirements

  • Currently pursuing or recently completed a BSc/MSc/PhD in Computer Science, Machine Learning, or a related field.
  • Strong Python programming skills.
  • Hands-on generative AI experience, including with common ML frameworks (e.g., PyTorch, Transformers).
  • Strong communication skills, with a willingness to explain technical concepts to diverse audiences.
  • Permitted to work in the job’s location.

Nice To Haves

  • Experience deploying/serving LLMs with vLLM, SGLang, or TensorRT-LLM.
  • Familiarity with inference optimization techniques such as quantization, batching, caching, and routing.
  • Knowledge of model architectures and fine-tuning approaches.
  • Contributions to open-source ML/AI projects.

Responsibilities

  • Help build and test LLM-based solutions and applications using Token Factory's inference services, including multimodal models (text, vision, audio).
  • Assist senior SAs with prompt engineering, model selection, benchmarking, and inference optimization.
  • Run performance and quality experiments to support proof-of-concept work.
  • Contribute to internal tooling and automation that improves how the SA team delivers.

Benefits

  • 100% company-paid medical, dental, and vision coverage for employees and families.
  • Up to 4% company match with immediate vesting.
  • 20 weeks of paid leave for primary caregivers and 12 weeks for secondary caregivers.
  • Up to $85/month for mobile and internet.
  • Company-paid short-term, long-term and life insurance coverage.
  • Competitive compensation
  • Career growth and learning opportunities
  • Flexibility and ownership
  • Collaborative and innovative culture
  • Opportunity to work on impactful AI projects
  • International environment and talented teams