Embedded HPC Engineer (Real-Time Edge Compute)

Metrea Management LLC
Victor, NY
Onsite

About The Position

The Embedded HPC Engineer (Engineer II) develops and integrates embedded software components that mature RADLAB radar R&D algorithms into a robust, testable, real-time processing system on edge CPU/GPU platforms. This role focuses on implementing high-rate sensor ingest pipelines, building record/replay and validation infrastructure, integrating SDR and compute platforms, and optimizing performance to meet defined latency, throughput, and determinism targets under guidance from senior technical leadership.

Requirements

  • BS/MS in Electrical Engineering, Computer Engineering, Computer Science, or equivalent experience; 3–6+ years of experience focused on embedded and/or high-performance real-time systems.
  • Experience in platform bring-up and integration: drivers/BSP concepts, device configuration, high-rate streaming enablement, and diagnosing issues across hardware/firmware/software boundaries.
  • Strong C/C++ systems programming: concurrency, memory management, profiling, and maintainable interface design.
  • Experience with low-latency / high-throughput streaming systems: buffering, loss detection, telemetry, failure modes, and graceful degradation mechanisms.
  • Working knowledge of real-time design concepts: bounded latency, avoiding lock contention, timing measurement, and meeting latency/jitter requirements under load.
  • Evidence-driven debugging skills using professional tools such as JTAG/GDB (where applicable), bus/packet traces, logic analyzers/scopes, perf/ftrace, and logs/metrics.
  • Experience building automated acceptance/regression tests, including golden-vector validation, tolerances, metrics, and structured pass/fail gates, and integrating them into CI/CD.
  • Strong written communication for validation reports, integration risk tracking, and mitigation strategies.
  • U.S. citizenship.
  • Ability to obtain and maintain a TS security clearance.
  • Strong communication and collaboration skills for cross-functional team environments.
  • Commitment to reproducibility, documentation, and engineering and mathematical rigor.

Nice To Haves

  • Experience with Ettus/UHD SDR ecosystems, high‑throughput multi‑core/HPC systems, and precise time/clock synchronization (timestamping, disciplined references, metadata integrity).
  • Linux performance engineering skills (such as CPU affinity, scheduling, and network/I/O tuning) and/or RTOS experience.
  • Radar algorithm exposure sufficient to validate implementation fidelity (e.g., matched filtering, Doppler processing, calibration, detection chains) and recognize algorithm-specific failure modes.
  • GPU experience a plus: CUDA or GPU-accelerated signal-processing work, including asynchronous pipelines and throughput measurement; experience with GPUDirect RDMA/Storage is a strong plus but not required.
  • FPGA via HLS a plus: experience using C++-based HLS or collaborating with FPGA engineers; ability to reason about throughput/latency constraints and streaming dataflow.

Responsibilities

  • Implement real-time software components that operationalize RADLAB radar algorithms, including dataflow stages, buffer management, error handling, and state management.
  • Support SDR and compute-platform bring-up by executing baseline boot/configuration procedures, integrating drivers and device APIs, enabling streaming modes, and performing stability/soak testing under sustained high-rate load; escalate and document integration issues with clear repro steps.
  • Develop and maintain high‑throughput data pipelines from ingest through embedded compute and into high‑speed storage, including buffering strategies, backpressure control, integrity verification, and record/replay workflows.
  • Contribute to deterministic behavior by implementing concurrency, prioritization, and bounded-latency patterns (e.g., lock avoidance, bounded queues, careful memory ownership) to meet predefined timing budgets and jitter targets; measure and report timing performance.
  • Integrate standard interfaces across the stack such as device control APIs, networking/high-speed I/O, storage, time sync, and telemetry.
  • Validate algorithm implementation fidelity by integrating reference implementations, running golden vectors/recorded datasets, computing defined equivalence metrics/tolerances, and investigating mismatches with disciplined root-cause analysis.
  • Perform performance profiling and optimization of assigned subsystems using established tools and methods (e.g., CPU affinity guidance, memory reuse, reduced copies, cache-aware data layouts, pipeline parallelism); propose and prototype improvements with evidence.
  • Build and execute acceptance and regression tests using golden vectors and recorded datasets; produce concise validation notes with pass/fail criteria, performance measurements, and issue triage outcomes.
  • Develop and maintain test rigs, bench procedures, and automation for repeatable bring-up and regression testing; contribute to CI/CD for hardware-coupled systems, including build artifacts, smoke tests, and on-target hooks.
  • Provide engineering feedback to R&D (Algorithm) Engineers, Hardware Engineers, and the Head of RADLAB on constraints discovered during implementation, such as throughput, latency, memory, quantization, and real-time feasibility; implement mitigation plans and contribute to redesigning software strategies when needed.