Senior Storage Systems Engineer

Crusoe, San Francisco, CA

About The Position

At Crusoe, we are on a mission to align the future of computing with the future of the climate. As a Senior Storage Systems Engineer, you will be the primary operator of our high-performance data layer. This role focuses on the availability, scaling, and operational excellence of our all-flash storage ecosystems, specifically VAST Data or Pure Storage, ensuring they deliver the sub-millisecond latency required for world-class AI training and HPC workloads. You will lead the day-to-day administration of our global storage footprint, serving as the subject matter expert for our flash-based platforms. Your work ensures that our sustainable GPU clusters have the reliable, high-throughput data backbone needed to power the AI revolution.

Requirements

  • 5–8+ years of experience in Storage Administration, with at least 3 years of hands-on experience managing VAST Data or Pure Storage in a production environment.
  • Deep understanding of NFS over RDMA, SMB, and NVMe-oF, and how they are implemented within VAST and Pure architectures.
  • Strong command of the Linux CLI, specifically for mounting, tuning, and troubleshooting high-performance file systems.
  • Understanding of how storage interacts with InfiniBand and RoCE fabrics to ensure low-latency data delivery to GPU nodes.
  • Proficiency in Python, Bash, or a similar scripting language for automating volume creation, quota management, and reporting via storage APIs.
  • A meticulous approach to capacity planning and documentation, ensuring the environment remains stable as we add petabytes of scale.

Nice To Haves

  • Experience with Pure1 or VAST VMS/Insight for predictive analytics and capacity forecasting.
  • Familiarity with Slurm or Kubernetes (CSI) integration with high-performance storage.
  • Prior experience in a "Large Scale" environment (multi-petabyte footprints).

Responsibilities

  • Own the end-to-end management of VAST Data (Universal Storage) and Pure Storage (FlashBlade/FlashArray) environments, including initial setup, volume provisioning, and export management.
  • Proactively monitor VAST and Pure clusters for IOPS, throughput, and latency bottlenecks, ensuring storage performance stays ahead of GPU demand.
  • Execute software upgrades (Purity//FB, VAST OS), expansion of D-Nodes/C-Nodes, and hardware refreshes with zero downtime for our AI customers.
  • Manage snapshots, replication policies, and data reduction (deduplication/compression) strategies to optimize TCO while ensuring 100% data durability.
  • Act as the lead technical point of contact for storage incidents, working directly with VAST and Pure support engineering to resolve complex fabric or metadata issues.
  • Use REST APIs and Python to automate provisioning and integrate storage health metrics into our centralized observability stack (Grafana/Prometheus).

Benefits

  • Competitive compensation and equity packages
  • Restricted Stock Units
  • Paid time off, paid holidays & leave of absence programs
  • Comprehensive health, dental & vision insurance
  • Employer contributions to HSA
  • Paid parental leave
  • Paid life insurance, short-term and long-term disability
  • Professional development & tuition reimbursement
  • Mental health & wellness support
  • Commuter benefits (parking & transit)
  • Cell phone stipend
  • 401(k) Retirement plan with company match up to 4% of salary
  • Volunteer time off
  • Global travel insurance & emergency assistance
  • Daily meal allowance
  • Additional perks & programs specific to location