About The Position

The Roblox storage team plays a fundamental role in the company's success by designing and running highly scalable, secure data storage systems across geographic regions worldwide. As a Principal Software Engineer on the Object Store team within the Infrastructure Storage organization, you will lead the development of an innovative object store and design distributed software and tools to manage storage systems that support exabyte-scale data, trillions of objects, and terabyte-per-second throughput. You will be instrumental in enhancing and operating our large-scale distributed systems. Your contributions will address Roblox's continuously expanding business demands by focusing on the architecture, performance tuning, and capacity planning of our storage platform. This role also entails ensuring durability and availability, implementing robust security measures, and automating operations to support the backend systems behind many critical use cases at Roblox.

Requirements

  • Experience designing, delivering, and operating large-scale object store technology at petabyte scale or larger
  • Deep domain knowledge of Ceph or similar object storage technologies is a plus
  • Builder mindset to run large-scale Active/Active distributed systems on top of container orchestrators like Kubernetes or Nomad, and service discovery systems like Consul
  • Proficiency in programming languages like C++ or Golang
  • BS degree in Computer Science (or equivalent professional experience), with at least 8 years of hands-on experience

Responsibilities

  • Improve and scale our large distributed 24x7 services and deliver features focused on cost efficiency, availability of four nines (99.99%) or better, and elastic scalability
  • Take a leading role in designing, implementing, and running our internal Infrastructure-as-a-Service offerings on top of a Kubernetes container orchestration platform
  • Design and build frameworks/tools to automate development, testing, cluster management and monitoring of mission-critical services
  • Improve the SLAs of the services we offer and the end-to-end rollout time of our suite of software solutions