AI Researcher – Datadog AI Research

Datadog | NY
314d | $130,000 - $265,000

About The Position

Datadog has recently expanded its AI Research initiatives. Building on our proven track record of AI-powered solutions (e.g., Bits AI, Watchdog, and Toto), our research team is tackling high-risk, high-reward projects grounded in real-world challenges in cloud observability and security. We are currently focused on three key research areas: Observability Foundation Models, Site Reliability Engineering (SRE) Autonomous Agents, and Production Code Repair Agents. As a researcher on our team, you will help drive these efforts by working on fundamental research problems and collaborating with Datadog’s Product and Engineering teams to translate research advances into tangible benefits for our customers.

Requirements

  • PhD in Computer Science, Machine Learning, or a related field with deep expertise in areas like generative modeling, AI agents, reinforcement learning, or natural language processing.
  • Extensive experience in designing and implementing deep learning models, and a strong background in distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and ML libraries (PyTorch, TensorFlow).
  • Proven track record of conducting impactful research in the field with publications at top-tier venues (e.g., NeurIPS, ICLR, ICML, TMLR).
  • Familiarity with efficient training, fine-tuning, and inference techniques for large foundation models.
  • Ability to explain complex models and research findings to both technical and non-technical audiences.
  • Strong interest in open-science and open-source contributions, including establishing rigorous benchmarks and sharing research with the community.

Nice To Haves

  • Demonstrated ability to bridge cutting-edge research and real-world product applications, ideally with an emphasis on large foundation models, generative AI agents, or domain-specific LLM deployments.
  • Passion for pushing the boundaries of AI while maintaining a strong focus on customer impact, scalability, and responsible deployment of new technologies.
  • Hands-on experience with GPU programming and optimization, including experience in CUDA.
  • Experience writing production data pipelines and applications.

Responsibilities

  • Conduct cutting-edge research in Generative AI and Machine Learning, aiming to build specialized Foundation Models and AI Agents for observability, site reliability engineering, and code repair.
  • Leverage large-scale distributed training infrastructure to train and fine-tune state-of-the-art models on diverse, real-world telemetry data.
  • Lead and contribute to research publications, present findings at top-tier conferences (e.g., NeurIPS, ICLR, ICML), and help open-source key model artifacts and benchmarks.
  • Collaborate with cross-functional teams (e.g., Product, Engineering) to integrate advanced AI capabilities into Datadog’s product ecosystem.
  • Stay at the forefront of LLMs, Foundation Models, and Generative AI research and engage with the external research community.
  • Foster a culture of scientific rigor, innovation, and practical impact, e.g., by actively participating in reading groups and mentoring interns.

Benefits

  • Competitive global benefits.
  • New hire stock equity (RSUs) and employee stock purchase plan (ESPP).
  • Opportunity to collaborate closely with colleagues across the Datadog offices in New York City and Paris.
  • Opportunity to attend and present at conferences and meetups.
  • Intra-departmental mentor and buddy program for in-house networking.
  • An inclusive company culture, including the ability to join our Community Guilds (Datadog employee resource groups).
  • Healthcare, dental, parental planning, and mental health benefits.
  • 401(k) plan and match.
  • Paid time off.
  • Fitness reimbursements.