Senior Data Infrastructure Engineer

Judgment Labs•San Francisco, CA

27d•Onsite

About The Position

We are Judgment. We build infrastructure for Agent Behavior Monitoring (ABM): surfacing silent behavioral issues, understanding how agents behave in production, and turning interaction data into actionable signals. Hundreds of teams building autonomous agents rely on Judgment to understand how their systems are behaving post-deployment. When something breaks, they’re not stuck in reactive incident triage. They can see which behaviors are trending, which configurations caused regressions, and what to actually fix.

We’ve raised $30M+ across two rounds in the past five months. Our investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

The Role: We are looking for a Senior Data Infrastructure Engineer to build and scale the real-time data pipelines that power agent behavior analysis at production scale. This role is crucial for processing hundreds of thousands of traces per second, running LLM-based scoring and clustering in near-real time, and delivering the low-latency query performance that enables teams to understand agent behavior as it happens. We need someone who has built petabyte-scale data systems, knows how to squeeze performance out of OLAP databases, and can own the data infrastructure from ingestion through analytics.

Requirements

Experience building and tuning high-throughput, petabyte-scale data pipelines
Deep knowledge of data infrastructure (Apache Spark, Ray, dbt, Airflow/Dagster)
Experience with OLAP database engineering
Comfort with cloud infrastructure and with both batch and streaming pipelines
Senior-level ownership: you will own the infrastructure roadmap and architecture design, set practices, identify bottlenecks, and ship fixes

Nice To Haves

Experience working with LLM inference and serving optimization techniques such as:
Speculative decoding
Continuous batching and dynamic batching strategies
KV cache optimization and management
Quantization techniques (INT8, INT4) for reduced memory footprint
Multi-GPU serving and tensor parallelism

Responsibilities

Design the streaming pipeline that scores and clusters a 100k+ traces/s workload using LLM APIs in near-real time (Kafka + Spark/Ray).
Identify LLM API serving bottlenecks by analyzing flame graphs, and raise RPS through smart batching/streaming, adaptive concurrency, and connection pooling.
Speed up ClickHouse queries and reduce p95/p99 latency with better schemas and partitioning, projections and materialized views, and tiered storage.