Staff Engineer

Workato
Palo Alto, CA
Remote

About The Position

Workato delivers enterprise infrastructure for the agentic era, redefining iPaaS and helping enterprises unify data, applications, processes, and AI into a single, governed platform. A leader in Enterprise MCP and trusted by 50% of the Fortune 500, Workato’s cloud-native architecture connects every application, data source, and process to power real-time orchestration at scale. With enterprise-grade security and continuous innovation at its core, Workato provides the trusted foundation for organizations to automate with confidence and operationalize AI across the business.

Requirements

  • Bachelor’s degree (or foreign equivalent) in Computer Science, Management, or a closely related field
  • 5 years of progressively responsible experience in the job offered or a related occupation
  • 3 years of experience with Rust, including Tokio, asynchronous programming, concurrency, performance optimization, and allocator profiling
  • 2 years of experience with Apache DataFusion and Apache Arrow, including Parquet, data pipelines, query planning, and vectorized execution
  • 3 years of experience creating integration tests with real dependencies using Docker and Testcontainers
  • 2 years of experience with behavior-driven testing for distributed services using tools such as Gherkin and Cucumber
  • 2 years of experience with performance benchmarking, including throughput and latency analysis, regression detection, and capacity planning
  • 2 years of experience with load testing using Locust and wrk, including test scenario design, ramp-up strategies, and analysis of latency, throughput, and error rates
  • 1 year of experience with chaos engineering and fault injection, including network partitions, process termination, and resource pressure testing for resilience validation
  • 2 years of experience designing and scaling distributed backend services, including rate limiting, fair queuing, back-pressure control, cluster coordination, gossip-based membership protocols (e.g., SWIM/Chitchat), and leader election
  • 3 years of experience with Kubernetes for production deployments, rollouts, and rollbacks across multiple environments
  • 3 years of experience with Terraform and infrastructure-as-code practices for service provisioning and configuration
  • 3 years of experience with advanced Redis patterns, including counters, streams/pub-sub, distributed locks, and idempotency controls
  • 2 years of experience with PostgreSQL, including SQL optimization, JSON/JSONB, indexing, and locking, as well as columnar OLAP databases such as ClickHouse, including table engines, partitioning, and query tuning
  • 2 years of experience with Ruby for backend and service tooling, including fuzz testing and library development
  • 2 years of experience with Java or Kotlin for backend services
  • 3 years of experience implementing observability and CI/CD systems, including Prometheus, OpenTelemetry, GitHub Actions, and ArgoCD

Responsibilities

  • Design and develop production-grade distributed services in Rust using async/Tokio, with focus on concurrency, performance, and scalability
  • Own the full service lifecycle from system design and implementation through deployment and operations
  • Build and optimize data-processing and transformation pipelines with emphasis on throughput, latency, and memory efficiency
  • Create and maintain integration tests with real service dependencies in containerized environments
  • Improve test determinism, stability, and reliability across distributed systems
  • Deploy and operate services across development, staging, and production environments using infrastructure-as-code practices
  • Implement safe rollout and rollback procedures using GitOps and CI/CD workflows
  • Develop and evolve observability systems including logs, metrics, and distributed tracing
  • Define service-level objectives (SLOs), configure alerts, and lead incident response and post-incident reviews
  • Design and maintain distributed cluster coordination systems using gossip-based membership and leader-election mechanisms for resilience and scalability
  • Plan and execute performance benchmarking and load testing, including capacity modeling and regression detection
  • Drive performance optimization initiatives across distributed services
  • Apply fuzz testing techniques to critical components to improve reliability and security
  • Practice chaos engineering in lower environments through fault injection, network partitioning, and resource pressure testing to validate resilience and recovery objectives
  • Participate in architecture reviews and code reviews
  • Contribute to technical design documents and RFCs
  • Mentor peers and collaborate cross-functionally on service integrations and stateful components

Benefits

  • Full-time telecommuting permitted from anywhere in the United States