DevOps Engineer

Reality Defender · New York, NY
14d

About The Position

As a DevOps Engineer, you’ll design and operate the infrastructure that powers our production AI detection platform. You will own cloud architecture, reliability, deployment pipelines, and security posture, ensuring our systems can scale to global demand while remaining resilient, observable, and cost-efficient. This is a hands-on role with real ownership over platform decisions and technical direction.

Requirements

  • 4+ years of experience in DevOps, SRE, or infrastructure engineering roles.
  • Deep experience with at least one major cloud provider (AWS strongly preferred).
  • Strong production experience with Kubernetes and container orchestration; CI/CD systems (GitHub Actions, GitLab CI, etc.); Infrastructure as Code; and Linux and networking fundamentals.
  • Proven track record supporting distributed, high-traffic systems.
  • Experience implementing monitoring, logging, and alerting (Prometheus, Grafana, Datadog, etc.).
  • Strong security mindset and experience with secrets management, IAM, and network isolation.

Responsibilities

  • Architect and operate high-availability, low-latency cloud infrastructure for real-time AI inference services.
  • Build and maintain CI/CD pipelines that support rapid, safe iteration across backend and ML systems.
  • Own production reliability: monitoring, alerting, incident response, and capacity planning.
  • Lead infrastructure automation using Infrastructure as Code (Terraform, Pulumi, etc.).
  • Partner closely with backend and ML engineers to productionize new models and services.
  • Harden our systems for security, compliance, and data protection.
  • Optimize cost, performance, and scalability as usage grows.
  • Establish best practices around observability, deployment strategies, and disaster recovery.

Benefits

  • Healthcare plans with 100% premium coverage for employees and partial coverage available for dependents
  • Dental and Vision plans with 100% premium coverage for employees and their dependents
  • Short/Long-term disability and life insurance plans with 100% premium coverage for employees
  • FSA/HSA and 401k programs
  • Equity compensation
  • 20 days of PTO per year
  • 12 weeks of Parental Leave
  • Learning and Development budget
  • Monthly wellness benefits
  • Annual company-sponsored offsite
  • Daily in-office lunch through UberEats
  • Commuter benefits
  • Remote Fridays
  • Happy Hours and other local events

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 11-50 employees
