The Data Center Customer Engineering team is seeking experienced MLOps Engineers to lead deployment and optimization of rack-scale deep learning workloads powered by Qualcomm Cloud AI inference accelerators. These accelerators leverage Qualcomm's expertise in hardware-accelerated AI to power high-performance, energy-efficient generative AI and computer vision inference workloads in the data center.

In this role, you will collaborate closely with strategic partners and customers to drive seamless provisioning, orchestration, optimization, monitoring, and lifecycle management of end-to-end deep learning inference pipelines on Qualcomm's Cloud AI data center deployments. Ideal candidates will bring a strong foundation in ML model deployment, systems engineering, rack-scale management software, DevOps/MLOps automation, and cross-functional collaboration.

This role involves the following activities:
- Ensure optimal performance, uptime, and availability of Cloud AI data center deployments
- Manage Qualcomm Cloud AI Accelerator hardware for AI/ML workloads
- Commission and decommission equipment
- Oversee physical infrastructure: servers, storage, networking, power, cooling
- Deploy and maintain infrastructure-as-code tools
- Monitor and manage incident response, troubleshooting, root cause analysis, and preventative measures
- Oversee software updates and maintenance
- Monitor usage trends and plan for infrastructure scaling
- Manage relationships with hardware, software, and service vendors
- Coordinate with internal teams: IT, engineering, security
- Provide regular reports on uptime, incidents, capacity, and performance metrics
- Track KPIs and SLAs
- Ensure redundancy and failover mechanisms
- Document and enforce standard operating procedures

Candidates for this position will demonstrate the following:
- Understanding of AI/ML inference workloads
- Strong problem-solving and diagnostic abilities
- Ability to work in high-pressure environments and respond to incidents quickly
- Strong attention to detail with a focus on quality and reliability
- Good communication skills for coordinating with teams
- Hands-on experience installing, troubleshooting, and maintaining servers, storage devices, networking equipment, and AI accelerators
- Familiarity with Linux, bare-metal, and virtualization platforms
- Familiarity with data center infrastructure
- Experience with Data Center Infrastructure Management software and environmental monitoring systems
- Commitment to ongoing learning and professional development
- Knowledge of programming/scripting languages like Python or Bash
- Familiarity with cloud platforms
Job Type
Full-time
Career Level
Mid Level
Number of Employees
5,001-10,000 employees