Product Reliability Engineer

PointOne
New York City, NY
Onsite

About The Position

PointOne builds infrastructure for the legal industry, powering timekeeping and billing systems used by law firms and government agencies. We're a venture-backed startup (Y Combinator, Bessemer, 8VC, General Catalyst) made up of software engineers (Jane Street, Google, Stanford, Princeton) and ex-attorneys. To keep up with inbound customer demand, we are quickly scaling our engineering team following a $16M Series A.

We process the most confidential data for institutions working on the most sensitive matters, and our customers depend on us being up, accurate, and fast, always. We're hiring a Product Reliability Engineer to own the health, stability, and observability of our systems end-to-end.

The Role

You are the connective tissue between our customers and our product. You're a product engineer focused on a key dimension of the user experience: reliability. When reliability is threatened, you'll work directly with customers to understand impact, stop the bleeding, create a robust fix, and then close the loop.

However, this role isn't only reactive. The best PREs use front-line signals to make the whole system more resilient: better observability, fewer recurring failures, and proactive investments that get ahead of problems before customers feel them. This is a hands-on, full-stack engineering role at the intersection of product, infrastructure, and customer impact.

Requirements

  • 2+ years of software engineering experience, with meaningful time spent in reliability, platform, or production-facing roles
  • Strong debugging instincts and comfort tracing failures across distributed systems using logs, traces, and metrics
  • Hands-on experience with AWS (Lambda, SQS, RDS, CloudWatch or equivalent)
  • Comfortable reading and writing Go, TypeScript, or similar backend languages
  • Experience building or improving observability infrastructure (alerting, dashboards, telemetry)
  • High ownership mentality: you close the loop, you write the postmortem, you ship the fix

Nice To Haves

  • Experience in legaltech, fintech, healthtech, or other high-sensitivity, always-on environments

Responsibilities

  • Respond quickly to automated alerts and customer-reported issues
  • Triage, diagnose, and resolve production incidents with a bias toward permanent fixes over workarounds
  • Build and maintain incident response playbooks and postmortem processes
  • Coordinate cross-functionally with customer success managers and key account stakeholders to maintain customer trust in the event of an incident
  • Design and instrument telemetry, logging, and alerting across our serverless AWS stack
  • Build dashboards and health metrics that surface issues before customers feel them
  • Identify recurring failure patterns and drive systemic fixes into the codebase
  • Reduce operational toil through automation
  • Contribute directly to the codebase: improve resilience, reduce tech debt, and build automation so that bugs are resolved quickly and with little human intervention
  • Partner with engineers on new feature launches to assess reliability risks before they ship
  • Make data-driven recommendations on where to invest in stability

Benefits

  • Comprehensive health, dental, and vision insurance
  • Meals in office
  • Regular team events