About The Position

Oracle Cloud Infrastructure (OCI) Compute delivers bare metal and virtual machine instances, both CPU and GPU, at global scale. We are hiring a Consulting Member of Technical Staff to provide hands-on technical leadership for OCI compute control plane services, with a focus on Imaging and Container Registry. You will architect and deliver distributed systems that are multi-tenant, highly available, and horizontally scalable across OCI regions, while driving operational excellence and cross-team technical alignment.

We are looking for a hands-on senior engineer with technical breadth, proven experience solving cloud-scale problems, and distributed systems design and implementation experience to build fault-tolerant solutions that will form the foundations of the next generation of Compute offerings. The candidate is expected to have strong written and verbal communication skills, the ability to lead projects across organizational boundaries, and experience representing their work to senior leaders.

Only Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. And with AI embedded across our products and services, we help customers turn that promise into a better future for all. Discover your potential at a company leading the way in AI and cloud solutions that impact billions of lives.

True innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing a workforce that promotes opportunities for all with competitive benefits that support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.

We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing [email protected] or by calling 1-888-404-2494 in the United States.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

Requirements

  • BS or MS degree in Computer Science/Engineering or a related IT field or equivalent experience relevant to functional area.
  • 10+ years of development experience with large scale, highly available distributed systems
  • Proficiency in Java programming patterns; programming experience with Scala or Python is preferred.
  • Advanced knowledge of data structures, algorithms, and operating systems.
  • Experience with operating distributed services at scale
  • Expertise in Linux and operating systems
  • Systematic problem-solving approach, strong communication skills, strong ownership and drive
  • Deep understanding of service metrics and alarms through the development of dashboards, service KPIs, alarming systems
  • Ability to propose, scope, design, and direct automation, optimizations, and enhancements
  • Proven ability to drive technical outcomes, take ownership of deliverables, and work independently in fast-evolving AI solution spaces.
  • Strong communication skills, with the ability to articulate technical concepts, document solution approaches, and collaborate across multiple geographically distributed teams.
  • Demonstrated problem-solving ability leveraging AI, distributed systems, and cloud-native application behaviors.
  • A proactive, experimentation-oriented mindset with a strong willingness to learn and to guide the team on emerging AI technologies, frameworks, and engineering patterns.

Nice To Haves

  • Experience in management and automation of end-to-end CPU/GPU lifecycles at scale
  • Proficiency with Cloud and CI/CD environments
  • Proficiency with Kubernetes, OS Images and Terraform
  • Proficiency with modern build tools and pipelines
  • Proficiency building multi-tenant, virtualized infrastructure
  • Proficiency with change control management and mature operating processes
  • Proficiency with Security including Identity, SSL and certificates
  • Proficiency with Database and Data Stores

Responsibilities

  • Architect, design, and operate distributed, highly available, and resilient systems for multi-tenant, horizontally scalable, and cost-efficient architectures that deliver consistent latency, throughput, and durability across OCI regions.
  • Collaborate cross-functionally with Compute, Storage, Networking, OKE, and Functions to deliver new platform features focused on Imaging and Container Registry Services, enforce secure-by-default designs, and improve overall service reliability.
  • Mentor and guide engineers in distributed systems design, high-scale data processing, and operational excellence; set and raise engineering standards across multiple teams.
  • Drive operational excellence by owning service-level objectives (availability, latency, durability) and reducing toil through automation, observability, and self-healing mechanisms.
  • Own the full service lifecycle from design and implementation to deployment, on-call, and continuous improvement — maintaining high code and reliability standards.
  • Define and drive the technical roadmap for OCI Imaging and Container Registry Services.
  • Partner with product management and field teams to translate customer needs into roadmap priorities for Oracle Imaging & Container Registry Services.
  • Contribute to the broader Compute vision, influencing how Oracle’s Imaging & Container Registry Services evolve to support mission-critical workloads globally.

Benefits

  • flexible medical
  • life insurance
  • retirement options