Systems Engineer, HPC

Mistral AI

About The Position

At Mistral AI, we build high-performance, open, and efficient AI systems designed to power the next generation of applications. Our infrastructure combines large-scale distributed systems, cloud platforms, and HPC environments to support cutting-edge research and production workloads. We are a collaborative, low-ego, and highly technical team, operating across Europe, the US, and beyond. As we scale rapidly, we are building the foundational infrastructure to support thousands of nodes and petabyte-scale systems. Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture on https://mistral.ai/careers.

About the Role

We are looking for Systems Engineers / System Administrators to help design, operate, and scale the infrastructure behind Mistral’s AI platforms.

This is a hands-on, hybrid role combining:

Systems administration (operating and troubleshooting large-scale Linux environments)
Systems engineering (automation, scalability, and performance improvements)

You’ll work closely with infrastructure, HPC, and research teams to ensure our clusters and platforms run reliably at scale.

Requirements

Strong Linux systems administration experience (core requirement)
Experience working in large-scale environments: HPC clusters or cloud infrastructure
Experience with job schedulers (e.g. Slurm)
Solid troubleshooting skills across systems, hardware, and networks
Pragmatic problem solver who can operate in fast-scaling environments
Comfortable working across multiple domains (“Swiss army knife” mindset)
Able to go deep in one area while learning others
Low-ego, collaborative, and hands-on

Nice To Haves

Containers / orchestration (e.g. Kubernetes)
Storage systems (e.g. Ceph, Lustre, NFS)
Networking fundamentals (Ethernet; InfiniBand is a plus)
Infrastructure as Code / automation tooling
GPU or AI/ML experience

Responsibilities

Operate and maintain large-scale Linux environments (bare metal, clusters, cloud)
Monitor system health, troubleshoot incidents, and ensure high availability
Support production and research workloads across multiple environments
Help scale clusters toward hundreds to thousands of nodes
Work on systems handling petabyte-scale storage
Improve performance, reliability, and resource utilisation
Automate operational tasks using tools like Python, Bash, Ansible, or Terraform
Improve deployment, provisioning, and system lifecycle management
Contribute to system design and architecture decisions
Work closely with HPC / infrastructure teams, platform / DevOps engineers, and research teams
Act as a bridge between users and infrastructure

Benefits

Impact: Play a pivotal role in scaling Mistral’s cutting-edge AI infrastructure.
Growth: Opportunity to shape data centre operations from the ground up in a high-growth startup environment.
Collaboration: Work with a talented, cross-functional team passionate about AI and technology.
Flexibility: Competitive compensation, benefits, and the chance to contribute to revolutionary projects.
