Senior Engineer 2: Inference Data Plane

DigitalOcean
Seattle, WA (Remote)

About The Position

DigitalOcean is expanding its AI Infrastructure layer to support the next generation of AI-driven applications. We are seeking a Senior Engineer 2 to join our AI Inference Data Plane team. In this role, you will be a key technical leader responsible for designing, developing, and delivering high-scale, resilient data plane services that power our "Inference as a Service" offering. You will work at the intersection of distributed systems and specialized AI hardware to ensure our customers can deploy and scale their models with industry-leading performance and reliability. This is a hands-on role: you will be expected to develop high-quality software while taking full advantage of the productivity gains offered by the latest AI coding agents.

Requirements

  • Distributed Systems Expertise: Strong experience with microservices, messaging systems, databases, and infrastructure as code.
  • AI/ML Domain Knowledge: Hands-on experience hosting large language or multimodal models using inference engines like vLLM, SGLang, or Modular.
  • Inference Frameworks: Familiarity with distributed inference serving frameworks such as llm-d, NVIDIA Dynamo, or Ray Serve.
  • Hardware & Interconnects: Understanding of GPU-level optimization and experience with interconnect technologies like NVLink, XGMI, or RoCE.
  • Architecture Proficiency: Knowledge of common LLM architectures and optimization techniques (e.g., continuous batching, quantization).
  • Software Engineering: Expert-level proficiency in Go or Python and familiarity with gRPC.
  • Cloud Operations: Proven experience shipping customer-facing software products and operating critical services in a high-scale environment like DigitalOcean's.
  • Open Source Mindset: Experience integrating and building with open-source software.
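To make the "continuous batching" technique named above concrete, here is a minimal, illustrative Python sketch (the request IDs, token counts, and function name are invented for this example and do not come from any particular inference engine): a new request is admitted into the running batch the moment a finished request frees a slot, instead of waiting for the entire batch to drain as in static batching.

```python
from collections import deque

def continuous_batching(requests, max_batch=2):
    """Toy continuous-batching scheduler.

    requests: iterable of (request_id, tokens_remaining) pairs.
    Returns (total_decode_steps, order_in_which_requests_finished).
    """
    waiting = deque(requests)
    running = []          # mutable [request_id, tokens_remaining] entries
    finished_order = []
    steps = 0
    while waiting or running:
        # Admit waiting requests into any free batch slots (the key
        # difference from static batching, where admission happens
        # only when the whole batch is empty).
        while waiting and len(running) < max_batch:
            running.append(list(waiting.popleft()))
        # One decode step advances every active request by one token.
        for req in running:
            req[1] -= 1
        finished_order += [req[0] for req in running if req[1] == 0]
        running = [req for req in running if req[1] > 0]
        steps += 1
    return steps, finished_order
```

With a batch size of 2 and requests needing 2, 4, and 1 decode steps, request "c" slots in as soon as "a" completes, so the short request is not stuck behind the long one.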

Responsibilities

  • Act as a technical leader on the team, driving the end-to-end design, development, and delivery of critical data plane components hosting large generative AI models.
  • Architect and refine system design proposals for our high-scale, multi-tenant AI inference cloud ecosystem, ensuring they meet rigorous availability and resiliency standards.
  • Implement and optimize distributed inference hosting using techniques like tensor/data parallelism, KV cache optimizations, and smart routing.
  • Work cross-functionally with Product Managers, customer-facing teams, and other engineering teams to align technical roadmaps with customer needs.
  • Coach and mentor junior engineers, fostering a culture of technical excellence and continuous improvement.
  • Maintain and operate critical, high-scale services, utilizing observability tools and defining SLOs to ensure superior platform health.

Benefits

  • You will work with some of the smartest and most interesting people in the industry.
  • We are a high-performance organization that will always challenge you to think big.
  • Our organizational development team will provide you with resources to ensure you keep growing.
  • We provide employees with reimbursement for relevant conferences, training, and education.
  • All employees have access to LinkedIn Learning's 10,000+ courses to support their continued growth and development.
  • Regardless of your location, we provide a competitive array of benefits to support you, from our Employee Assistance Program to local employee meetups to a flexible time off policy, to name a few.
  • We also provide equity compensation to eligible employees, including equity grants upon hire and the option to participate in our Employee Stock Purchase Program.


What This Job Offers

Job Type: Full-time
Career Level: Senior
Education Level: None listed
Number of Employees: 501-1,000
