Sr. Systems Performance Engineer

Aurelius Systems
San Francisco, CA
Onsite

About The Position

Aurelius Systems is a VC-backed defense tech startup building autonomous, edge-deployed directed energy systems for counter-UAS: laser weapon systems that neutralize drones. The Sr. Systems Performance Engineer will be responsible for the architecture and performance of the entire software stack, which spans complex subsystems such as sensing, computer vision, ML inference, controls, fire control, C2, power, and mechanical actuation. The role emphasizes minimizing latency and optimizing efficiency across the stack so that the system operates at its physical limits.

A critical aspect of this role is real-time systems performance architecture at the hardware boundary: understanding how software execution translates to physical system behavior, how latency accumulates across hardware components, what the bandwidth limitations are, and how architectural decisions affect downstream performance. This is a senior individual contributor role with potential for subteam lead responsibilities, focused on kernel-level optimization, driver work, and latency performance in fire control and C2.

Requirements

  • 4+ years in real-time systems or robotics software engineering with real hardware experience.
  • Expert-level modern C++ (C++17/20).
  • Driver-level and kernel-level coding experience.
  • CUDA kernel optimization for throughput and latency.
  • Deep understanding of GPU memory models (global, shared, unified memory).
  • Real-time pipeline architecture experience.
  • ARM + Linux systems development (cross-compiling, profiling, kernel-level awareness).
  • Performance optimization across CPU/GPU boundaries.
  • Shared memory and lock-free architecture design.
  • Experience with high-throughput peripheral data ingestion (USB, PCIe, Ethernet).
  • Experience with multithreaded systems and concurrency optimization.
  • Must be a "U.S. Person" (U.S. citizen, legal permanent resident, or certain protected classes of asylees and refugees).

Nice To Haves

  • Jetson platform experience.
  • DMA and zero-copy pipeline design.
  • Video pipeline experience (OpenCV, GStreamer, Vimba/Pylon).
  • Experience with CoaXPress, USB3 Vision, or high-speed camera systems.
  • Linux kernel contributions or driver-level experience.
  • Prior experience leading a small technical subteam or owning architecture for a production system.

Responsibilities

  • Own the architecture, performance, and latency budget of the full platform, from sensing through actuation.
  • Perform kernel-level and driver-level coding across the stack.
  • Profile and eliminate latency across CPU, GPU, memory, and I/O boundaries.
  • Develop and optimize CUDA kernels for high-throughput, low-latency execution.
  • Tune memory access patterns (global, shared, unified) for bandwidth efficiency.
  • Make real-time architecture decisions across fire control and C2.
  • Design high-bandwidth sensor data ingestion pipelines.
  • Provide technical direction and mentorship for a subteam of engineers focused on optimization and real-time systems.
  • Identify development priorities by analyzing technical and physical system limitations in the field.
  • Author architecture documentation and standards for the engineering team.

Benefits

  • Competitive salary + equity
  • UnitedHealthcare medical, dental, and vision coverage
  • 18 days of flexible PTO + 5 sick days
  • Travel to field test events and range days
  • Covered daily lunches and office snacks + drinks
  • E-bike / scooter stipend (up to $500)