About The Position

abra R&D is looking for an Embedded Engineer to help build the first real-time database purpose-built for AI agents at scale. Designed around time-series and unstructured data, the database uses a custom storage format optimized for append-heavy, real-time workloads. Joining the team means working on a high-performance execution engine built on vectorized execution, SIMD, and cache-efficient memory layouts, delivering extremely low latency and high throughput at scale. The system is engineered to fully utilize modern hardware, with deep optimization across the CPU cache hierarchy (L1/L2/L3), pushing the limits of real-time data processing.

Requirements

  • 7+ years of experience in high-performance systems programming (C/C++)
  • Deep understanding of computer architecture (CPU pipelines, cache hierarchies, memory access patterns)
  • Strong experience with low-level optimization and profiling tools
  • Proven knowledge of multithreaded development (lock-based and lock-free)
  • Expertise in algorithms and data structures, especially cache-aware designs
  • Experience in one or more of the following:
      • Database internals (query engines, storage engines, query planners/optimizers)
      • Time-series or real-time data systems
      • High-performance systems (trading systems, game engines, networking stacks, compilers)
      • Distributed systems (sharding, partitioning, consistency models)

Nice To Haves

  • Experience with vectorized execution engines (e.g., DuckDB-style processing)
  • Experience designing custom storage formats or low-level data layouts
  • Experience handling unstructured or semi-structured data at scale
  • Background in query optimization and execution planning

Responsibilities

  • Design and implement core components of the database engine (query engine, execution engine, storage engine)
  • Build vectorized execution pipelines optimized for SIMD
  • Design and evolve a storage engine optimized for time-series data (custom on-disk and in-memory formats)
  • Work on unstructured / event-driven data representations and efficient indexing/querying strategies
  • Own memory layout, compression, and data defragmentation
  • Develop cache-aware / cache-efficient data structures with deep understanding of CPU cache behavior
  • Implement distributed data primitives: sharding, partitioning, replication, and data locality
  • Profile and optimize performance at the CPU level (cache misses, branch prediction, memory bandwidth)