About The Position

Join Oracle Cloud Infrastructure’s Compute team to design, build, and scale the next generation of bare-metal provisioning systems powering millions of servers worldwide. As a senior engineer, you will develop highly reliable and secure infrastructure, tackle complex distributed systems challenges, and help deliver the foundation for OCI’s most performant compute services.

Oracle Cloud Infrastructure (OCI) is building the next generation of cloud services to support the world’s most demanding workloads. The Compute team is responsible for delivering the bare-metal provisioning infrastructure that powers millions of servers and forms the foundation of OCI’s rapidly expanding AI infrastructure.

The Compute Bare Metal Provisioning team owns the critical infrastructure that automates the full server lifecycle, from creation of new platform shapes (AMD/Intel/Arm/Nvidia) and hardware bring-up to customer-ready instance provisioning and firmware management. The services operate at the intersection of bare-metal hardware and full-stack orchestration frameworks, a unique combination in which both distributed systems engineers and engineers with a background in Linux and firmware are highly valued. The team interfaces directly with components such as BMCs, NICs, SmartNICs, ILOMs, GPUs, and custom firmware stacks. It builds high-performance, scalable microservices and tooling that provision, configure, secure, and validate server platforms across OCI’s massive fleet of Compute and GPU infrastructure.

You will partner closely with other teams in Compute, Networking, Security, Data Center Engineering, and Hardware Development to ensure OCI can launch, scale, and maintain new server platforms with minimal operational overhead and high reliability. You will work directly with cutting-edge GPU hardware and see the direct impact of your work on the business.

We strive for equity, inclusion, and respect for all. We are committed to the greater good in our products and our actions. We are constantly learning and taking opportunities to grow our careers and ourselves. We challenge each other to stretch beyond our past to build our future. You are the builder here. You will be part of a team of really smart, motivated, and diverse people and given the autonomy and support to do your best work. It is a dynamic and flexible workplace where you’ll belong and be encouraged.

If you are interested in building large-scale distributed infrastructure for the cloud, want to work on cutting-edge GPU infrastructure and the latest Compute systems, and have a knack for distributed systems and/or Linux development with systems experience, then this is your team! Oracle is aggressively investing in the Oracle Cloud to provide the broadest, most comprehensive cloud in the industry.

Requirements

  • 3-8+ years of experience delivering and operating large-scale, highly available distributed systems, Linux development, and systems debugging.
  • Strong knowledge of object-oriented programming languages such as C++ or Java, and experience with scripting languages such as Python.
  • Strong knowledge of data structures, algorithms, operating systems, and distributed systems fundamentals.
  • Experience with tools such as Terraform for Infrastructure as Code.
  • Working familiarity with networking protocols (TCP/IP, HTTP) and standard network architectures.
  • Strong understanding of databases, storage and distributed persistence technologies.
  • Strong troubleshooting and performance tuning skills.

Nice To Haves

  • Experience building multi-tenant, virtualized infrastructure is a strong plus.

Responsibilities

  • As a Senior Member of Technical Staff, you will own the software design and development for major components of Oracle’s Cloud Infrastructure.
  • You should be a rock-solid developer, a driven problem solver, and a distributed systems generalist and/or Linux developer with systems experience, able to dive deep, design, develop, operate, and debug any part of the stack and low-level systems such as Linux, Docker, Java web services, and Terraform, as well as design broad distributed system interactions.
  • You should have a tenacious attitude to improve the status quo, independently seek out problems to solve and take action to deliver results wherever needed.
  • You should value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn.

Benefits

  • Medical, dental, and vision insurance, including expert medical opinion
  • Short-term and long-term disability
  • Life insurance and AD&D
  • Supplemental life insurance (Employee/Spouse/Child)
  • Health care and dependent care Flexible Spending Accounts
  • Pre-tax commuter and parking benefits
  • 401(k) Savings and Investment Plan with company match
  • Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
  • 11 paid holidays
  • Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
  • Paid parental leave
  • Adoption assistance
  • Employee Stock Purchase Plan
  • Financial planning and group legal
  • Voluntary benefits including auto, homeowner and pet insurance

What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000
