About The Position

The Akamai Inference Cloud team is part of Akamai's Cloud Technology Group. We design, implement, deploy, and operate AI platforms that enable customers to run inference models and developers to create AI applications with unmatched performance, compliance, and economics. As a Senior Engineering Manager, you will build and lead a high-performing team of platform and ML engineers from the ground up. Your team will design and develop our globally distributed AI inference platform, delivering OpenAI-compatible endpoints and orchestrating inference workloads across regions.

Requirements

  • 10 years of relevant experience and a Bachelor's degree or its equivalent, including experience building and scaling high-performing teams that shipped successful AI/ML products
  • Possess hands-on experience with AI inference optimization, model serving, and LLM deployment at scale with deep knowledge of inference frameworks (TensorRT, vLLM, TorchServe, Triton)
  • Have an understanding of containerization strategies for AI workloads with hardware-specific optimizations, and possess opinions on what makes an AI platform successful
  • Show proficiency with cloud-native technologies including Kubernetes and distributed systems with proven experience operating services at global scale
  • Demonstrate expertise in building highly available, low-latency platforms with strict SLOs and cost optimization strategies for compute-intensive AI workloads

Nice To Haves

  • Knowledge of AI application platforms, AI safety, GPU infrastructure, and hardware acceleration is ideal

Responsibilities

  • Building and scaling a world-class engineering team from the ground up, recruiting top talent in AI infrastructure and ML operations
  • Leading the technical strategy for a global AI inference platform that is performant, compliant, economical, and explainable
  • Ensuring the availability, performance, scalability, and security of Akamai Inference Cloud
  • Designing global traffic orchestration for AI workloads and establishing platform standards and blueprints for production-grade AI applications
  • Making critical decisions on AI tooling based on technical evaluation while ensuring compliance with regulatory requirements (e.g., FedRAMP, GDPR, SOX)

Benefits

  • FlexBase, Akamai's Global Flexible Working Program
  • healthcare
  • 401K savings plan
  • company holidays
  • vacation (in the form of PTO)
  • sick time
  • family-friendly benefits, including parental leave
  • employee assistance program including a focus on mental and financial wellness
  • Employee Stock Purchase Plan (ESPP)