Senior Software Engineer

Alluxio, Foster City, CA
$190,000 - $260,000

About The Position

Alluxio powers the data layer for modern AI and analytics. Proven in production at eight of the top ten internet companies and seven of the ten highest-valued enterprises globally, Alluxio’s data orchestration platform unifies data across storage systems, regions, and clouds, providing a high-performance distributed caching layer built for large-scale AI workloads. Spun out of UC Berkeley’s AMPLab by the creators of Tachyon and backed by Andreessen Horowitz, Hillhouse Capital, and Seven Seas Partners, Alluxio sits at the intersection of data, distributed systems, and AI infrastructure. Our technology is deployed at scale by organizations such as Meta, Uber, Tencent, TikTok, Alibaba, Expedia, Rakuten, Microsoft, and Walmart, orchestrating data for billions of operations per day. Learn more at alluxio.io or on Wikipedia.

We’re looking for experienced distributed-systems engineers to join our Core Product team and advance the next generation of Alluxio’s data-orchestration engine - the foundation for AI and analytics at global scale. As a Senior Software Engineer, you’ll work on high-impact systems problems such as:

  1. Optimizing metadata management, caching, and replication across thousands of nodes.
  2. Designing concurrent, fault-tolerant services for multi-region and multi-cloud environments.
  3. Evolving Alluxio’s storage abstraction and scheduling layer to support large-scale AI/ML data pipelines.
  4. Collaborating with internal product teams to push the limits of distributed I/O performance.

This is a hands-on, architecture-plus-implementation role for engineers who love deep systems work and want visible impact in a small, senior, highly technical team.

Requirements

  • Strong computer-science fundamentals and a passion for large-scale distributed systems.
  • Professional experience developing in Java, C++, or Go.
  • Practical knowledge of concurrency, replication, distributed coordination, and performance tuning.
  • Experience with distributed storage, caching, or data-access layers (e.g., Spark, Presto, Hadoop, Kubernetes).
  • Bachelor’s or advanced degree in Computer Science or related technical field (or equivalent experience).

Responsibilities

  • Cache and metadata enhancements - design and implement improvements to caching policies, eviction logic, and metadata scalability to increase performance and reliability.
  • Data path optimization - refine I/O pipelines for S3/GCS/HDFS/POSIX to reduce latency and improve throughput using concurrency and scheduling techniques.
  • Distributed systems reliability - strengthen consistency, replication, and fault-tolerance mechanisms across large-scale clusters.
  • Feature development and integration - collaborate with product and solution-engineering teams to deliver features that support AI and analytics workloads.
  • Code quality and peer collaboration - participate in design reviews, provide constructive feedback, and ensure robust testing and observability in production systems.
  • Design, build, and optimize distributed components within Alluxio’s orchestration layer.
  • Investigate performance bottlenecks and propose scalable solutions using profiling, tracing, and benchmarking tools.
  • Collaborate cross-functionally with fellow engineers, architects, and the open-source community to drive improvements.
  • Contribute to releases and stability efforts, ensuring enterprise-grade reliability across global deployments.

Benefits

  • Build infrastructure trusted by the world’s largest AI and data-driven companies.
  • Join a small, senior engineering team where your designs shape the product’s evolution.
  • Work directly with the original creators of open-source Alluxio.
  • A culture of empathy, curiosity, and ownership - where engineers collaborate closely to solve hard problems.