We’re seeking a Principal Engineer to serve as the hands-on technical leader for our next-generation Inference Platform. As a senior individual contributor, you will architect and build the fastest, most cost-effective, and most reliable GPU inference services in the industry. You’ll prototype new capabilities, drive engineering standards, and work shoulder-to-shoulder with engineering, product, orchestration, and hardware teams to make CoreWeave the best place on earth to serve frontier models in production.

About the role:

Technical Vision & Strategy - Define the technical roadmap for ultra-low-latency, high-throughput inference. Evaluate and influence adoption of runtimes and frameworks (Triton, vLLM, TensorRT-LLM, Ray Serve, TorchServe) and guide build-vs-buy decisions.

Platform Architecture - Design Kubernetes-native control-plane components that deploy, autoscale, and monitor fleets of model-server pods spanning thousands of GPUs. Implement advanced optimizations (micro-batching, speculative decoding, KV-cache reuse, early-exit heuristics, tensor/stream parallel inference) to squeeze every microsecond out of large-model serving. Build intelligent request routing and adaptive scheduling to maximize GPU utilization while guaranteeing strict P99 latency SLAs.

Operational Excellence - Create real-time observability, live debugging hooks, and automated rollback/traffic-shift for model versioning. Develop cost-per-token and cost-per-request analytics so customers can instantly select the ideal hardware tier.

Hands-on Development - Write production code, reference implementations, and performance benchmarks across gRPC/HTTP, CUDA Graphs, and NCCL/SHARP fast-paths. Lead deep-dive investigations into network, PCIe, NVLink, and memory-bandwidth bottlenecks.

Mentorship & Collaboration - Coach engineers on large-scale inference best practices and performance profiling. Partner with lighthouse customers to launch and optimize mission-critical, real-time AI applications.
Job Type
Full-time
Career Level
Senior
Number of Employees
501-1,000 employees