Staff Embedded Software Engineer

Relativity Space • Long Beach, CA

Requirements

5+ years writing Linux kernel code: hands-on driver development for PCI/PCIe devices, block storage, or interrupt-driven hardware, with meaningful time spent in kernel space
Experience with storage systems: ZFS or other copy-on-write filesystems, RAID, NVMe internals, or high-throughput network storage (e.g., NFS)
Depth in one or more: filesystem internals, block layer / device management, or storage protocol implementation
Strong working knowledge of OS internals: virtual memory, interrupt context constraints, synchronization primitives, and I/O stack behavior

Nice To Haves

Hands-on experience at the driver-level hardware/software boundary: DMA coherency, MMIO semantics, PCIe enumeration, and cache behavior
Strong working knowledge of data structures and systems reasoning for storage (Merkle trees, NVMe submission/completion queue ring buffers, hash tables, radix trees)
Experience testing storage systems, including fault injection (PCIe/NVMe resets, error storms), low-level tracing (ftrace/perf/bpftrace), and crash dump analysis (kdump/vmcore)
Experience designing software recovery around storage hardware fault cases, whether in storage firmware, autonomous vehicle data systems, large-scale distributed infrastructure, or embedded platforms
Familiarity with embedded Linux build systems (Yocto or Buildroot) and cross-compilation
Hardware lab comfort: serial consoles, logic analyzers, and willingness to debug PCIe enumeration failures on a prototype board alongside the electrical engineers

Responsibilities

Own the complete storage platform software stack for a space-based data center: custom Linux kernel drivers, OpenZFS pool design, NFS data serving, and automated fault recovery, shipping a platform that preserves up to a petabyte of mission data through years of radiation exposure
Design and implement custom Linux kernel drivers for NVMe fault recovery and GPIO overcurrent protection, working across PCI/PCIe, block layer, and interrupt subsystems to detect and recover from radiation-induced upsets without data loss
Lead architectural decisions on ZFS pool topology by building quantitative reliability models that balance upset probability, resilver risk, and capacity over a 6+ year mission, then validate them through fault injection testing
Develop the integration layer between NVMe controller reset and ZFS, ensuring that a drive recovering from a transient fault re-enters the storage pool cleanly, bridging driver-level recovery with filesystem-level fault tolerance
Rapidly prototype on commodity hardware, from first boot through sustained 10 Gbps writes with automated fault recovery, de-risking the architecture before committing to the target platform, then carry the design through integration and launch

Benefits

Relativity Space offers competitive salary and equity, a generous PTO and sick leave policy, parental leave, an annual learning and development stipend, and more!
