About The Position

The OCI (Oracle Cloud) AI Infrastructure Innovation team is inventing the next generation of storage technologies. You will lead architecture and hands-on development across key layers: distributed processing, transactions, consensus, and storage engines. If you thrive at the intersection of large-scale distributed systems, database internals, and cloud platforms, this role offers the opportunity to advance the state of the art.

Requirements

  • Deep expertise in distributed systems with hands-on delivery of large-scale, fault-tolerant, strongly consistent services.
  • Experience building distributed execution engines.
  • Proven ability to design for global scale: sharding/partitioning, placement policies, rebalancing, and multi-region replication.
  • Strong software engineering background with performance profiling, correctness testing, and rigorous code quality.
  • Cloud architecture experience on a major public cloud, including observability, orchestration, and incident response.
  • BS/MS in Computer Science, Electrical/Computer Engineering, or equivalent practical experience; proven technical leadership and mentoring.

Nice To Haves

  • Familiarity with high-performance IO paths; understanding of cross-region networking and latency trade-offs.
  • Strong foundation in consensus and transactions.
  • Expertise with observability at scale: tracing, metrics, logs, eBPF/perf, chaos/failure testing, and SLO-driven operations.
  • Knowledge of AI/HPC workload patterns and their implications for storage, query processing, and consistency models.

Responsibilities

  • Lead end-to-end architecture, system design, and implementation for distributed storage platforms.
  • Innovate on query processing, transaction, and IO performance, working across components: query planning and optimization, the distributed execution engine, indexes, and the storage engine.
  • Develop production-grade, high-performance software features with rigorous durability, correctness, observability, and security.
  • Define performance goals and success metrics; design benchmarks and conduct large-scale experiments to validate throughput and latency.
  • Define consistency, replication, and recovery strategies.
  • Collaborate across storage, networking, compute, and control-plane teams to deliver cohesive end-to-end solutions on OCI.
  • Mentor engineers, provide technical leadership and reviews, and influence the multi-year roadmap and technical standards.