Staff Software Engineer, Agentic Systems - Moveworks

ServiceNow
Mountain View, CA
Onsite

About The Position

Moveworks is the Agentic AI Assistant platform that empowers the entire workforce. Our platform enables employees to converse with all of their business systems through natural language to quickly find answers and automate tasks. Powered by the world's most advanced LLMs, our proprietary models, and a sophisticated Agentic AI platform, we're transforming how work gets done by allowing AI to take initiative, streamline complex workflows, and continuously learn and adapt.

Moveworks is trusted by over 5.5 million employees at more than 350 of the world’s largest companies, including 10% of the Fortune 500, to automate everyday tasks and streamline business operations. Recognized on the Forbes Cloud 100 and AI 50 lists, Moveworks was also named one of Fast Company’s 2025 Most Innovative Companies and one of Inc.’s Best in Business, in the Best in Innovation category. Moveworks was also recognized at Microsoft’s 2025 Partner of the Year Awards and received the AI Breakthrough Award in 2024.

In December 2025, Moveworks was acquired by ServiceNow, marking a pivotal milestone in our journey to create a single front door to work for all business systems. By combining ServiceNow’s leading workflow automation with Moveworks’ Reasoning Engine and natural language capabilities, we deliver the AI platform for every person and every workflow, built to go beyond basic summaries and deliver meaningful business impact. Together, our AI acts across enterprise systems to turn conversations into completed work.

By joining our team, you’ll be at the forefront of the AI transformation, backed by the global scale of ServiceNow and the agility of a high-growth company. We are looking for world-class talent to help us extend agentic AI to every employee across every corner of the business. Come join us!

ServiceNow

It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work.
Fast forward to today: ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work. But this is just the beginning of our journey. Join us as we pursue our purpose to make the world work better for everyone.

The Role

We're building the runtime infrastructure that powers Moveworks' AI agents: the systems that orchestrate, execute, and deliver agent responses to millions of enterprise users in real time. This is not an ML role. This is a distributed systems engineering role at the heart of the agentic AI wave.

Our AI agents can plan, execute multi-step workflows, call tools, wait on human input, and resume, all while maintaining correctness, observability, and low latency. The systems that make this possible are what you'll build and own.

What you get to do in this role:

  • Agent orchestration engine: a state machine that manages long-running agent sessions, coordinating planning, execution, and user interaction across multiple LLM calls and tool invocations
  • Distributed session management: lease-based ownership using DynamoDB conditional writes, heartbeat protocols, and crash recovery via checkpointing
  • Event-driven message pipeline: SQS FIFO queues for ordered delivery, Kafka consumers for event processing, and real-time streaming via gRPC and Socket.IO
  • Structured concurrency: Python asyncio TaskGroups running multiple concurrent tasks per session (message polling, lease heartbeats, output publishing, orchestrator execution) with fail-fast semantics and graceful cancellation
  • Observability infrastructure: OpenTelemetry instrumentation, distributed trace context propagation across async boundaries, and custom span lifecycle management for sessions that span minutes
  • Caching and state layers: Redis and DynamoDB KV stores with per-org/per-bot scoping, batch read optimization, and hot-reload configuration

Requirements

  • Deep experience in at least 3 of the following areas:
      • Distributed systems: consistency models, idempotency, exactly-once delivery, distributed locking/leasing
      • Concurrent/async programming: Python asyncio, Go goroutines, structured concurrency, cancellation handling
      • Event-driven architectures: message queues (SQS, Kafka), pub/sub, backpressure, delivery guarantees
      • Database systems for infrastructure: DynamoDB (conditional writes, transactions), Redis (connection pooling, pub/sub)
      • Observability: OpenTelemetry, distributed tracing, span context propagation, Prometheus metrics
      • gRPC/protobuf: streaming RPCs, service interface design, error handling patterns
  • 7+ years building production backend/infrastructure systems
  • Strong in Python or Go (ideally both)
  • Experience designing and operating systems that handle real traffic at scale
  • Comfort with ambiguity — these are novel problems without textbook solutions

Responsibilities

  • Build and own the runtime infrastructure that powers Moveworks' AI agents.
  • Develop systems that orchestrate, execute, and deliver agent responses to millions of enterprise users in real time.
  • Create an agent orchestration engine, a state machine managing long-running agent sessions.
  • Implement distributed session management using DynamoDB conditional writes, heartbeat protocols, and crash recovery.
  • Design and build an event-driven message pipeline using SQS FIFO queues, Kafka consumers, and real-time streaming via gRPC and Socket.IO.
  • Utilize structured concurrency with Python asyncio TaskGroups for concurrent tasks within sessions.
  • Develop observability infrastructure with OpenTelemetry, distributed trace context propagation, and custom span lifecycle management.
  • Implement caching and state layers using Redis and DynamoDB KV stores.

Benefits

  • Flexible scheduling
  • Remote work options
  • Required in-office work persona