The Red Hat AI Customer Adoption and Innovation (CAI) team is looking for a North America-based Forward Deployed AI Engineer to join our rapidly growing AI Business Unit. As inference technologies become more mainstream, our customers are seeking deep expertise in optimization, scalability, and production readiness.

In this role, you will act as a bridge between engineering and the customer's environment. You will be deployed to support lighthouse customer engagements, ensuring that Red Hat AI inference products are successfully implemented, tuned, and optimized to meet specific business requirements. We are looking for a hands-on practitioner who understands that technical implementation must serve a business constraint, whether that is cost, latency, or throughput. You will work directly with customers to design, deploy, and optimize complex AI inference solutions, while capturing those lessons to enable our wider field teams.

You must have significant experience (10+ years) as a consultant or technical architect, along with a strong understanding of inference and inference optimization backed by practical, relevant experience. While you will have the support of the wider CAI team to upskill on specific AI technologies, you must bring a strong consulting mindset and deep technical expertise in OpenShift or Kubernetes platform engineering, as well as a deep understanding of LLMs, generative AI, and inference.

This position can be remote, but candidates must be located in North America (United States preferred) and willing to travel as required, up to 20% of the time.
Job Type
Full-time
Career Level
Senior
Education Level
No Education Listed