Staff Software Engineer - Workflow Engine

Datadog, New York City, NY
Hybrid (posted 13d ago)

About The Position

We’re looking for an experienced engineer to join Datadog’s Workflow Engine Team, the group behind Atlas, our platform for building reliable, long-running workflows as code. Atlas is built on Temporal, the leading open-source workflow orchestration technology, and plays a central role in how Datadog builds complex distributed systems at scale. We work closely with the Temporal community and have contributed upstream improvements in areas like reliability, performance, and developer tooling, strengthening both Atlas and the broader ecosystem.

As a Staff Engineer, you’ll help shape the future of Atlas: evolving its architecture, improving performance and resilience, and making it the go-to workflow platform across Datadog. You’ll collaborate with other senior engineers and product teams to solve hard distributed systems challenges while mentoring teammates and contributing to our engineering culture. This is a high-impact role with direct influence on how Datadog builds and operates critical workflows powering our products and services.

At Datadog, we place value in our office culture - the relationships and collaboration it builds and the creativity it brings to the table. We operate as a hybrid workplace to ensure our Datadogs can create a work-life harmony that best fits them.

Requirements

  • 8+ years of experience building large-scale, distributed systems in production
  • Deep expertise in systems programming, workflow orchestration, or related domains (job scheduling, stream processing, etc.)
  • Experience designing for durability and correctness in stateful systems
  • Skilled at making architectural decisions and leading complex projects
  • Fluent in at least one systems-level language (e.g., Go, Java, C++, Rust)
  • Collaborative, with a track record of mentoring and growing other engineers
  • You’re excited about leveraging AI tools to enhance how you code, solve problems, and build – or eager to learn how

Nice To Haves

  • Prior experience with Temporal or another workflow orchestration system

Responsibilities

  • Design and implement high-scale, reliable, and durable workflow execution infrastructure on top of Temporal
  • Lead the evolution of Atlas to meet Datadog’s growing scale and reliability needs, running many millions of actions per minute
  • Support Datadog’s AI initiatives by evolving Atlas into the orchestration backbone for AI agents and enabling an AI-first development mindset internally
  • Partner with platform and product teams to make Atlas the standard for orchestrating workflows company-wide
  • Drive technical strategy for resilience, durability, and performance optimization
  • Mentor engineers and foster best practices in distributed systems development

Benefits

  • Competitive global benefits
  • New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
  • An inclusive company culture, with the ability to join our Community Guilds (Datadog employee resource groups)
  • Benefits and Growth listed above may vary based on the country of your employment and the nature of your employment with Datadog.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 1,001-5,000 employees
