Senior Platform Engineer, Operator

Ditto
Austin, TX
Remote

About The Position

Ditto is redefining how data moves at the edge, enabling developers to build resilient, real-time applications regardless of network conditions. Its peer-to-peer sync engine keeps devices connected and data consistent, even without internet access, and is trusted by organizations like Chick-fil-A, Delta Airlines, and the U.S. military. A globally distributed, fast-growing startup with over $145 million in funding, Ditto powers mission-critical experiences across various sectors.

The Senior Platform Engineer, Operator will own the architecture and evolution of the SMBP Operator's CRD API surface: designing flexible network configurations, building observability integrations, hardening the security posture, ensuring deployability across enterprise Kubernetes landscapes, and developing operational tooling. The role also involves partnering with customer-facing teams and engaging enterprise customers directly to understand deployment requirements and resolve technical escalations.

Requirements

  • Production Rust experience — you are comfortable owning a large, multi-crate Rust workspace, navigating async Rust, and reasoning about lifetimes and ownership in a long-running reconciliation loop context.
  • Deep Kubernetes internals — you understand how controllers, reconciliation loops, watches, owner references, finalizers, admission webhooks, and status subresources actually work; you've debugged controllers in production, not just written YAML against them.
  • Kubernetes operator development — hands-on experience building or maintaining a real Kubernetes operator using controller-runtime, kube-runtime, kubebuilder, or an equivalent framework; you understand the operational difference between a true reconciliation controller and a Helm wrapper.
  • Kubernetes networking fluency — you have worked with Service types, Ingress, IngressClass, and at least one major ingress controller at the configuration level; you can reason about the trade-offs between Ingress, LoadBalancer, and Gateway API as exposure patterns.
  • StatefulSet management in production — you understand rolling updates, PVC lifecycle, pod disruption budgets, and the operational implications of topology changes in stateful systems; you've debugged a stuck StatefulSet in a real cluster.
  • Observability integration — you have instrumented Kubernetes controllers with Prometheus metrics, have opinions about what good operator telemetry looks like, and understand how ServiceMonitor CRDs and Prometheus Operator fit together.
  • Enterprise customer orientation — you've built infrastructure or tooling subject to enterprise security reviews, compliance requirements, and sophisticated operational expectations; you understand what 'enterprise-grade' means in practice, not just in principle.

Nice To Haves

  • Experience with kube-runtime specifically (Rust's Kubernetes controller framework)
  • Familiarity with Strimzi (Kafka operator) — the SMBP operator integrates with Strimzi for transaction log management
  • Cert-manager integration experience (Certificate, Issuer, ClusterIssuer resources and ACME lifecycle)
  • Property-based or generative testing experience, particularly proptest or similar frameworks in a Rust context
  • OLM and OpenShift operator certification experience — OperatorHub listing, bundle format, ClusterServiceVersion authoring
  • Kubernetes Gateway API experience (HTTPRoute, GRPCRoute, TCPRoute) as an implementer or operator
  • Supply chain security tooling: Cosign, SBOM generation (Syft, SPDX), container image signing workflows
  • Experience with distributed systems concepts relevant to CRDT-based sync, eventually consistent data stores, or peer-to-peer replication
  • Open source contributions to cloud-native infrastructure, Kubernetes ecosystem projects, or Rust systems tooling

Responsibilities

  • Own the architecture and evolution of the SMBP Operator's CRD API surface, designing extensions that meet enterprise expectations — consistent status conditions, configurable security contexts, fine-grained network configuration, and admission validation that surfaces bad configs at apply time rather than deep in the reconciliation loop.
  • Design and implement flexible network configuration patterns that give customers choice of ingress controller, load balancer, and traffic management approach — without prescribing specific Kubernetes ecosystem components or hardcoding controller-specific behaviour.
  • Build the observability integration enterprise customers expect: Prometheus-native metrics, ServiceMonitor CRDs for automatic scraping, Kubernetes Events emission throughout the reconciliation lifecycle, and pre-built dashboards that surface operator and cluster health without manual instrumentation.
  • Harden the operator's security posture across pod security contexts, network policies, supply chain integrity (image signing, SBOM), and the configuration surfaces that every enterprise security team will audit before approving a production deployment.
  • Make SMBP deployable across the full enterprise Kubernetes landscape — EKS, GKE, AKS, OpenShift, and air-gapped on-prem — including OLM support, OperatorHub listing, and the multi-distribution testing infrastructure that validates real upgrade and failure scenarios.
  • Build operational tooling that makes the operator supportable at scale: must-gather diagnostics, kubectl plugins, compatibility matrices, and documented upgrade paths — the infrastructure that lets support and customer engineering teams resolve issues without needing cluster access.
  • Partner with customer-facing teams and directly engage enterprise customers to understand deployment requirements, work through technical escalations, and ensure that what you build reflects how customers actually run Kubernetes in production.

Benefits

  • Competitive salaries and meaningful equity.
  • Benefits vary by region to make sure you're covered in the ways that matter most.
  • In the US, that includes health, dental, vision, life, and disability insurance, plus a 401(k) and flexible spending accounts.
  • Flexible time off.
  • Atlanta and San Francisco offices are open if you ever want a place to work or meet up with teammates.

What This Job Offers

Job Type: Full-time
Career Level: Senior
Education Level: No Education Listed
Number of Employees: 101-250 employees
