Oracle • Posted 2 days ago
Full-time • Principal
Seattle, WA
5,001-10,000 employees

At Oracle Cloud Infrastructure (OCI), we are building the future of cloud computing: designed for enterprises, engineered for performance, and optimized for AI at scale. We are a fast-paced, mission-driven team within one of the world’s largest cloud platforms. The Generative AI Service team within OCI is focused on developing infrastructure and tools to operationalize Large Language Models (LLMs) and agentic AI systems. Our goal is to empower developers and enterprises to deploy intelligent applications and agents that integrate seamlessly with cloud services.

Role Summary
As a Principal Software Engineer (IC4), you will contribute to the design and implementation of scalable, distributed systems that serve LLMs and support agent-based workflows. You will work in a collaborative environment with applied scientists, ML engineers, and software teams to deliver performant and reliable AI infrastructure. This is a high-impact engineering role with opportunities to grow technical expertise in large-scale systems and advanced AI technologies.

Responsibilities
  • Contribute to the development and optimization of distributed systems for model inference and agent execution.
  • Implement features and enhancements in LLM service infrastructure using modern cloud technologies.
  • Collaborate with cross-functional teams to support scalable and secure deployment pipelines.
  • Assist in diagnosing and resolving production issues, improving observability and reliability.
  • Write maintainable, well-tested code and contribute to documentation and design discussions.

Required Qualifications
  • BS in Computer Science or related technical field.
  • 6+ years of experience in backend software development with cloud infrastructure.
  • Strong proficiency in at least one language such as Go, Java, Python, or C++.
  • Experience building and maintaining distributed services in a production environment.
  • Familiarity with Kubernetes, container orchestration, and CI/CD practices.
  • Solid understanding of computer science fundamentals such as algorithms, operating systems, and networking.

Preferred Qualifications
  • MS in Computer Science.
  • Experience working with LLM serving frameworks like vLLM, DeepSpeed, or FasterTransformer.
  • Exposure to agent-based AI systems or tool-based inference workflows.
  • Knowledge of cloud-native observability tools and scalable service design.
  • Interest in compiler or systems-level software design is a plus.

Benefits
  • Medical, dental, and vision insurance, including expert medical opinion
  • Short-term disability and long-term disability
  • Life insurance and AD&D
  • Supplemental life insurance (Employee/Spouse/Child)
  • Health care and dependent care Flexible Spending Accounts
  • Pre-tax commuter and parking benefits
  • 401(k) Savings and Investment Plan with company match
  • Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
  • 11 paid holidays
  • Paid sick leave: 72 hours of paid sick leave are available upon date of hire and refresh each calendar year. Unused balance carries over each year, up to a maximum cap of 112 hours.
  • Paid parental leave
  • Adoption assistance
  • Employee Stock Purchase Plan
  • Financial planning and group legal
  • Voluntary benefits including auto, homeowner and pet insurance