Principal Software Engineer - AI & ML Innovation

Oracle • Seattle, WA

About The Position

The Senior Principal AI/ML Software Engineer is responsible for evaluating, integrating, and optimizing cutting-edge technologies for AI/ML infrastructure, with a focus on low latency, high throughput, and efficient resource utilization for both model training and inference at scale. This role guides key strategic decisions related to Oracle Cloud's AI infrastructure offerings, spearheads the design and implementation of scalable orchestration for AI/ML workloads (incorporating the latest research in generative AI and large language models), and leads initiatives such as Retrieval-Augmented Generation and model fine-tuning. The ideal candidate will design and develop scalable, GPU-accelerated AI services using tools like Kubernetes and Python/Go, and must possess strong programming skills, deep expertise in deep learning frameworks, containerization, distributed systems, and parallel computing, along with a comprehensive understanding of end-to-end AI/ML workflows.

As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when everyone is empowered to contribute. That's why we're committed to growing an inclusive workforce that promotes opportunities for all. Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.

We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing [email protected] or by calling +1 888 404 2494 in the United States. Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

Requirements

Strong programming skills
Deep expertise in deep learning frameworks
Containerization
Distributed systems
Parallel computing
Comprehensive understanding of end-to-end AI/ML workflows

Responsibilities

Evaluate, integrate, and optimize state-of-the-art technologies across the stack for latency, throughput, and resource utilization in training and inference workloads.
Guide strategic decisions around Oracle Cloud's AI infrastructure offerings.
Design and implement scalable orchestration for serving and training AI/ML models, with attention to model parallelism and performance across the AI/ML stack.
Explore and incorporate contemporary research on generative AI, agents, and inference systems into the LLM software stack.
Lead initiatives in Generative AI systems design, including Retrieval-Augmented Generation (RAG) and LLM fine-tuning.
Design and develop scalable services and tools to support GPU-accelerated AI pipelines, leveraging Kubernetes, Python/Go, and observability frameworks.