Software Engineer, Agentic Systems - Moveworks

ServiceNow • Mountain View, CA

About The Position

Moveworks is seeking a Software Engineer to join its Agentic Systems team. The role focuses on building the runtime infrastructure that powers Moveworks' AI agents: the systems that orchestrate, execute, and deliver agent responses to millions of enterprise users in real time.

This is a distributed systems engineering role, not an ML role. You will build and own the systems that let AI agents plan, execute multi-step workflows, call tools, wait for human input, and resume, all while maintaining correctness, observability, and low latency.

The systems you will build include:

  • An agent orchestration engine (state machine for long-running sessions)
  • Distributed session management (lease-based ownership, heartbeats, crash recovery)
  • An event-driven message pipeline (SQS FIFO, Kafka)
  • Structured concurrency (Python asyncio TaskGroups)
  • Observability infrastructure (OpenTelemetry, distributed tracing)
  • Caching and state layers (Redis, DynamoDB)

Requirements

  • Deep experience in at least 3 of the following areas:
      • Distributed systems (consistency models, idempotency, exactly-once delivery, distributed locking/leasing)
      • Concurrent/async programming (Python asyncio, Go goroutines, structured concurrency, cancellation handling)
      • Event-driven architectures (message queues such as SQS and Kafka, pub/sub, backpressure, delivery guarantees)
      • Database systems for infrastructure (DynamoDB, Redis)
      • Observability (OpenTelemetry, distributed tracing, span context propagation, Prometheus metrics)
      • gRPC/protobuf (streaming RPCs, service interface design, error handling patterns)
  • 2+ years building production backend/infrastructure systems.
  • Strong in Python or Go (ideally both).
  • Experience designing and operating systems that handle real traffic at scale.
  • Comfort with ambiguity and novel problems without textbook solutions.

Responsibilities

  • Build and own the runtime infrastructure for Moveworks' AI agents.
  • Develop an agent orchestration engine that manages long-running agent sessions.
  • Implement distributed session management using lease-based ownership, heartbeats, and crash recovery.
  • Create an event-driven message pipeline for ordered delivery and real-time streaming.
  • Apply structured concurrency with Python asyncio TaskGroups to manage concurrent tasks.
  • Develop observability infrastructure for distributed tracing and span management.
  • Build caching and state layers using Redis and DynamoDB.