Production Reliability Engineer

Friedman Vartolo LLP
Village of Garden City, NY
$50,000 - $70,000
Hybrid

About The Position

We are seeking a Production Reliability Engineer to own the operational health of our deployed internal application portfolio. The portfolio is growing rapidly, with many new applications shipping every week, and we need a dedicated owner to ensure that what we deploy stays reliable, secure, and performant over its lifetime, and that the operational footprint stays manageable as the portfolio grows. This is a hands-on role for an engineer who thinks systematically about reliability, communicates clearly during incidents, and knows when to retire a deployed application that has outlived its usefulness. You will be the source of truth for "is our portfolio healthy?" and the decision-maker on what gets maintained, what gets fixed, and what gets retired. You will work closely with the Senior Technology Manager, a DevOps Engineer (counterpart role), and a group of Production Engineers who own the initial deployment and 30-day iteration of each application.

Requirements

  • 5+ years of professional experience in Site Reliability Engineering, DevOps, production engineering, or equivalent
  • Strong incident-response experience - leading incidents, not just participating; comfortable owning the on-call rotation
  • Experience defining and tracking SLOs and error budgets for production systems
  • Strong Azure cloud experience (or directly comparable AWS / GCP experience)
  • Hands-on experience with monitoring, alerting, and observability tooling (Azure Monitor, Application Insights, Datadog, New Relic, or equivalent)
  • Comfortable reading code in multiple languages (TypeScript, Python, C# preferred) for incident triage and root-cause analysis
  • Understanding of cloud security fundamentals - patching cadence, dependency management, secret rotation, base image hygiene
  • Excellent written communication for incident comms, postmortems, and portfolio reporting

Nice To Haves

  • Experience managing a portfolio of small-to-medium applications, rather than a single monolith
  • Experience with security tooling (Huntress, Microsoft Defender, Vanta, similar)
  • Familiarity with Databricks, data warehouses, or analytics platforms
  • Experience driving retire/deprecate decisions and the conversations that accompany them
  • Experience in regulated industries (legal, financial services, healthcare)
  • Familiarity with compliance frameworks (SOC 2, ISO 27001) from an operational-evidence perspective

Responsibilities

  • Own the operational health of the firm's deployed internal application portfolio
  • Lead incident response for production issues; serve as on-call rotation lead
  • Manage security patching, dependency updates, base image refreshes, and certificate rotation across the deployed portfolio
  • Conduct monthly triage of every deployed application - usage levels, error rates, security posture, retire/keep recommendation
  • Maintain the application retirement queue and drive retire/deprecate decisions for low-usage or obsolete applications
  • Define and track Service Level Objectives (SLOs) for application uptime, performance, and reliability
  • Partner with the DevOps Engineer to ensure deployment pipelines incorporate reliability requirements from day one
  • Produce a weekly portfolio health report - uptime trends, open incidents, security posture, retire/deprecate decisions
  • Conduct incident postmortems and drive remediation actions to completion
  • Contribute to security and compliance posture work (SOC 2 readiness, Vanta evidence collection) as it relates to operational reliability

Benefits

  • Paid parental leave options
  • Short- and long-term disability leave options
  • Comprehensive medical, dental, and vision insurance
  • Flexible Spending Accounts (FSA) and Dependent Care plans
  • Commuter benefits, including transit and parking options
  • Pet insurance to help care for your furry family members
  • 401(k) retirement plans with employer contributions
  • Gym and fitness reimbursements to support a healthy lifestyle
  • Annual business expense reimbursements for attorneys
  • Annual $500 travel budget to attend firm-sponsored social events
  • Hybrid work flexibility available after 90 days of employment