OCI is driving development of next-generation hyperscale GPU data centers built on Nvidia and AMD GPUs. OCI enables popular AI services such as OpenAI on GPU compute servers. We are looking for engineers experienced in working with GPU device drivers and the runtime libraries (CUDA and ROCm). You must understand GPU architectural concepts such as UVM and host-to-device and device-to-host interactions, and be able to quantify performance issues in all such interactions. We are looking for strong experience in building GPU drivers and the Linux kernels that interact with the GPU stack, and in debugging the issues that occur in them, including functional and performance issues when running GPU AI/ML/inference workloads. The candidate should be able to use all standard performance and stress-testing tools, such as DCGM and the NCCL and RCCL suites. In addition, we are looking for experience debugging and diagnosing system issues reported via RAS events from the GPU BMC and other monitoring agents. The candidate should have breadth of knowledge in BIOS, CPU and GPU BMC, and must show strong proficiency in C programming and working knowledge of Python or another scripting language used in AI/GPU environments.
As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with industry leaders in almost every sector, and continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when everyone is empowered to contribute. That's why we're committed to growing an inclusive workforce that promotes opportunities for all. Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing [email protected] or by calling +1 888 404 2494 in the United States. Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
Job Type
Full-time
Career Level
Mid Level
Industry
Publishing Industries
Education Level
No Education Listed
Number of Employees
5,001-10,000 employees