Lead DevOps/SRE Engineer

Launch Potato
Jersey City, NJ
Remote

About The Position

Launch Potato is a profitable digital media company that reaches 30M+ monthly visitors through brands such as FinanceBuzz, All About Cookies, and OnlyInYourState. As The Discovery and Conversion Company, our mission is to connect consumers with the world’s leading brands through data-driven content and technology. Headquartered in South Florida with a remote-first team spanning more than 15 countries, we’ve built a high-growth, high-performance culture where speed, ownership, and measurable impact drive success. At Launch Potato, you’ll accelerate your career by owning outcomes, moving fast, and driving impact with a global team of high-performers.

Requirements

  • 5+ years of production AWS infrastructure experience with deep Terraform expertise.
  • Hands-on experience building an SRE function from scratch, with complete ownership of it.
  • Experience with a multi-site company where PaaS or microservices are required.
  • CI/CD pipeline ownership in one or more previous roles.
  • Experience with PagerDuty and with standing up an on-call rotation.
  • 5+ years hands-on with AWS, Terraform, CI/CD pipeline ownership, and SRE tooling (OpenTelemetry, Grafana, PagerDuty or equivalent) in a production environment.
  • Ownership orientation: You don't wait to be assigned a problem. If something is broken, undocumented, or a risk, you flag it and fix it. If the runbooks don't exist yet, you write them.
  • Documentation discipline: You write things down. Runbooks, decision rationale, architecture patterns, incident post-mortems. The next person should be able to understand your work without asking you.
  • Cost consciousness: You think about the business impact of infrastructure decisions. You can explain a spending anomaly to a CFO in plain language. You know what things cost before you build them.
  • Calm under pressure: Production incidents happen. You triage clearly, communicate proactively with technical and non-technical stakeholders, and run a tight post-mortem without blame. You've been woken up at 3am. You can handle it.
  • Cross-functional communication: You can work with product engineers, legal/compliance, and executive leadership in the same week without switching communication modes awkwardly. You speak both engineer and business.
  • Proactive reliability: A good SRE reacts to outages. A great SRE catches degradation before it becomes an outage. You build alerting against the patterns, not just the failures.

Responsibilities

  • Own and evolve Launch Potato's cloud infrastructure, CI/CD platform, and compliance posture.
  • Build the SRE function from the ground up so product teams can ship faster without compromising reliability, security, or cost control.
  • Stand up the SRE practice from scratch: on-call rotation, PagerDuty configuration, SLA/SLO definitions for core infrastructure services, runbook library, and observability dashboards that tie site performance to business metrics.
  • Complete the AWS multi-account migration: move production workloads to an isolated account with zero unplanned downtime.
  • Deliver a SOC 2 Type I audit-ready infrastructure evidence package: own the technical controls implementation end-to-end.
  • Version and publish the Terraform module library (30+ modules) to a private registry to eliminate ad hoc git consumption by product teams.
  • Implement automated deployment rollback for ECS and Lambda: gate production on integration test passage.
  • Stand up monthly cost reporting to leadership: budget anomaly detection, savings plan recommendations, spend by service/team/environment.

Benefits

  • Profit-sharing bonus
  • Competitive benefits