About The Position

At Mistral AI, we build high-performance, open, and efficient AI systems designed to power the next generation of applications. Our infrastructure combines large-scale distributed systems, cloud platforms, and HPC environments to support cutting-edge research and production workloads. We are a collaborative, low-ego, and highly technical team, operating across Europe, the US, and beyond. As we scale rapidly, we are building the foundational infrastructure to support thousands of nodes and petabyte-scale systems. Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture at https://mistral.ai/careers.

About the Role

We are looking for Systems Engineers / System Administrators to help design, operate, and scale the infrastructure behind Mistral’s AI platforms.

This is a hands-on, hybrid role combining:

  • Systems administration (operating and troubleshooting large-scale Linux environments)
  • Systems engineering (automation, scalability, and performance improvements)

You’ll work closely with infrastructure, HPC, and research teams to ensure our clusters and platforms run reliably at scale.

Requirements

  • Strong Linux systems administration experience (core requirement)
  • Experience working in large-scale environments: HPC clusters or cloud infrastructure
  • Experience with job schedulers (e.g. Slurm)
  • Solid troubleshooting skills across systems, hardware, and networks
  • Pragmatic problem solver who can operate in fast-scaling environments
  • Comfortable working across multiple domains (“Swiss army knife” mindset)
  • Able to go deep in one area while learning others
  • Low-ego, collaborative, and hands-on

Nice To Haves

  • Containers / orchestration (e.g. Kubernetes)
  • Storage systems (e.g. Ceph, Lustre, NFS)
  • Networking fundamentals (Ethernet; InfiniBand is a plus)
  • Infrastructure as Code / automation tooling
  • GPU or AI/ML experience

Responsibilities

  • Operate and maintain large-scale Linux environments (bare metal, clusters, cloud)
  • Monitor system health, troubleshoot incidents, and ensure high availability
  • Support production and research workloads across multiple environments
  • Help scale clusters toward hundreds to thousands of nodes
  • Work on systems handling petabyte-scale storage
  • Improve performance, reliability, and resource utilisation
  • Automate operational tasks using tools like Python, Bash, Ansible, or Terraform
  • Improve deployment, provisioning, and system lifecycle management
  • Contribute to system design and architecture decisions
  • Work closely with HPC / infrastructure teams, platform / DevOps engineers, and research teams
  • Act as a bridge between users and infrastructure

Benefits

  • Impact: Play a pivotal role in scaling Mistral’s cutting-edge AI infrastructure.
  • Growth: Opportunity to shape data centre operations from the ground up in a high-growth startup environment.
  • Collaboration: Work with a talented, cross-functional team passionate about AI and technology.
  • Flexibility: Competitive compensation, benefits, and the chance to contribute to revolutionary projects.