Senior Site Reliability Engineer

GreenLite
New York, NY

About The Position

GreenLite is revolutionizing development in America by streamlining the collaboration between developers, builders, and local regulatory authorities. GreenLite’s software powers its Private Plan Review offering, serving many of the nation’s largest public retailers, developers, and production home builders. By leveraging GreenLite’s technology, its customers save months on each project, significantly accelerating their timelines and staying within budget. As our first dedicated SRE, you will establish the patterns, tooling, and culture that keep our systems fast, observable, and resilient while we 10x traffic over the next 18 months. Our operating principles—Winning Mentality, Speed & Urgency, Disagree & Commit, Ownership & Integrity, Customer Centricity—are not wall art; they guide hiring, architecture, and on‑call decisions.

Requirements

  • 6+ yrs building and operating production systems in AWS, GCP or Azure (AWS preferred).
  • Demonstrated ownership of SLOs, incident response and post‑incident analysis.
  • Expert in IaC (Terraform, CDK, Pulumi) and container orchestration (ECS, EKS or K8s).
  • Proficient with at least one modern language (Python, Rust, Go) and strong bash skills.
  • Deep familiarity with observability stacks (Datadog, Grafana, Prometheus, OTEL).
  • Track record of raising the bar for security, compliance and cost optimisation.

Nice To Haves

  • Experience with infrastructure for ML workflows (model training, feature stores).
  • Prior work in construction‑tech, gov‑tech or other regulated domains.
  • Certification: AWS Solutions Architect or DevOps Pro.
  • Experience introducing chaos engineering or game‑days.
  • Public track record (blog posts, OSS) advancing the SRE discipline.
  • Leadership in defining hiring/on‑call processes at a high‑growth startup.

Responsibilities

  • Design & harden production infrastructure: AWS ECS/Fargate via AWS Copilot (migrating to Terraform), RDS/Postgres, S3, EventBridge, Bedrock.
  • Lead reliability engineering: SLO/SLA definition, error‑budget policies, capacity planning and load testing ahead of major launches.
  • Own CI/CD: advance our GitHub Actions pipeline, and introduce progressive delivery and automated rollbacks to steadily improve deployment frequency and lead time for changes.
  • Instrument & observe: deploy metrics, tracing and logging (Datadog) and drive an on‑call culture focused on MTTR and learning reviews, not blame.
  • Security & compliance: partner with engineers to automate patching, secrets management & rotation, least‑privilege IAM and SOC 2 controls.
  • Coach & collaborate: mentor engineers on SRE best practices, work closely with ML and product squads, and influence architecture decisions through strong opinions loosely held.
  • Continuously improve: identify systemic bottlenecks, build tooling that eliminates toil, and scale our platform without scaling pager fatigue.

Benefits

  • Competitive Compensation - Generous base salary & access to our Employee Equity Program.
  • Performance-Based Annual Bonuses - Rewards for high-impact results and contributions.
  • Premium Health Coverage - Comprehensive medical, dental, and vision insurance for full-time team members.
  • 401(k) Retirement Plan - Helping you invest in your future with smart saving options.
  • Parental Leave - Generous parental leave for all parents.
  • Wellness Support - Monthly Wellness Stipend and full access to Wellhub, Talkspace, & Teladoc.
  • Weekly Team Lunches - Enjoy catered lunches every week in our NYC office.
  • Company-Wide Team All Hands - Held twice a year, fostering transparency, alignment, and inspiration.
  • Team-Building Events - Regular opportunities to connect, collaborate, and celebrate as a team.
  • Unlimited PTO - Flexible time off so you can recharge, travel, or take care of life as needed.
  • Hybrid Work Environment - In-office 4 days per week, switching to a 3-day schedule in summer.