Sr GenAI Infra Specialist SA, AWS WWSO Startup

Amazon•Herndon, VA

15d•$153,600 - $228,600•Onsite

About The Position

This role is within the AWS WWSO Startup team, focusing on Generative AI infrastructure. The Specialist Solutions Architect will help define the future of technology on AWS Generative AI, with a focus on AI infrastructure for model training and inference optimization. Responsibilities include defining, building, and deploying strategies to accelerate the adoption of AWS compute, networking, and ML platform services with lighthouse Frontier AI model builders in the startup ecosystem. The role requires expertise at the intersection of AI infrastructure architecture and model optimization, advising customers on hardware requirements (GPU, Trainium, networking) and providing deep expertise in optimizing models for inference serving and distributed training at scale. AWS Specialist Solutions Architects (SSAs) are deep domain experts who work with customers on complex challenges, crafting scalable, flexible, and resilient technical architectures.

Requirements

Experience conveying complex technical concepts to both technical and business audiences
8+ years of experience in technology domain areas (e.g., systems engineering, cloud infrastructure, HPC, ML/AI, distributed computing)
3+ years of experience designing, implementing, or consulting on large-scale AI/ML infrastructure with hands-on experience on GPU-based computing, ML training infrastructure, and inference serving systems

Nice To Haves

Experience in developing and deploying LLMs in production on GPUs, Neuron, TPU or other AI acceleration hardware, or experience with CUDA kernels or ML/low-level kernels
Experience with vLLM, SGLang, TensorRT or similar platforms in production environments, or experience in performant kernel development (CUTLASS, FlashInfer)
Experience with container orchestration for ML: EKS, Kubernetes operators for ML (KubeRay), Karpenter, KEDA, Kubernetes DRA
Experience with HPC schedulers and managed platforms: Slurm, AWS PCS (Parallel Computing Service), SageMaker HyperPod
Experience with fine-tuning and model optimization techniques: LoRA, QLoRA, RLHF, DPO, knowledge distillation, quantization, KV cache optimization

Responsibilities

Work directly with key Startup customers in the GenAI model training and inference space to help them adopt and scale large-scale workloads on AWS.
Advise customers on AI infrastructure requirements and trade-offs, including GPU/Trainium selection, cluster topology, storage, networking (EFA), and cost optimization for training and inference.
Provide deep technical guidance on inference optimization: model serving architectures (self-managed on EKS, SageMaker endpoints, SageMaker HyperPod Serving), batching strategies, quantization, model parallelism, and latency/throughput trade-offs.
Provide deep technical guidance on training optimization: distributed training strategies, framework selection (PyTorch, JAX, NeMo), SageMaker HyperPod, Slurm/PCS integration, checkpointing, and data pipeline design.
Guide customers on GPU and accelerator profiling, identifying bottlenecks (compute, memory, I/O), optimizing utilization, and tuning system-level performance.
Help customers understand and apply model optimization techniques such as fine-tuning approaches (LoRA, QLoRA, full fine-tuning), RLHF/DPO, knowledge distillation, and efficient serving techniques (vLLM, TensorRT-LLM, Triton).
Help Go-To-Market Specialists define and drive strategy on assets that impact growth through market sizing, building an opportunity pipeline, creating technical content to train field teams, and establishing thought leadership.
Develop demos, proofs of concept, reference architectures, and benchmarks that demonstrate the value of AWS infrastructure for GenAI workloads.
Collaborate with product teams (EC2, Trainium/Inferentia, SageMaker, EKS, PCS) to shape product vision, prioritize features, and represent the voice of the customer.
Work with account teams, research scientists, ISVs, framework communities, and model providers to drive implementations and accelerate innovation.

Benefits

health insurance (medical, dental, vision, prescription, basic life and AD&D insurance with an option for supplemental life plans, EAP, mental health support, medical advice line, flexible spending accounts, adoption and surrogacy reimbursement coverage)
401(k) matching
paid time off
parental leave
sign-on payments
restricted stock units (RSUs)
