Site Reliability Engineer | Growth and Transformation

Red Ventures | Charlotte, NC
$100,000 - $145,000

About The Position

The Growth and Transformation team at Red Ventures is seeking a Site Reliability Engineer (SRE) to ensure our platforms and applications are resilient, scalable, and perform at lightning speed. You’ll work across AWS, GCP, and Kubernetes environments, helping us meet ambitious 99.99% uptime goals while driving automation, observability, and performance improvements at scale. This isn’t a reactive role: you’ll be empowered to design reliability into our systems from the start, partner with engineers across the org, and continually improve how we build, monitor, and operate mission-critical services. This role follows a hybrid schedule: you’ll work from our South Charlotte, NC headquarters Tuesday through Thursday and fully remotely on Mondays and Fridays.

Requirements

  • 3–5 years in SRE, DevOps, or cloud infrastructure engineering.
  • Strong experience with AWS, GCP, and Kubernetes orchestration.
  • Skilled in infrastructure as code (Terraform).
  • Proficient in observability and monitoring tools (New Relic, Grafana, OpenTelemetry).
  • Familiar with CI/CD pipelines, automated deployments, and scripting (Python, Bash, Go, etc.).
  • Experience maintaining high-availability systems (99.9%+ uptime).
  • Strong grasp of distributed systems, microservices, and scalability patterns.
  • Incident response and troubleshooting experience with a focus on learning from failures.
  • Excellent communication and collaboration skills.

Nice To Haves

  • Certifications (AWS Solutions Architect, GCP Professional Cloud Architect).
  • Experience with chaos engineering, resilience testing, or load balancing at scale.
  • Familiarity with Salesforce or Adobe ecosystems.
  • Database performance tuning expertise.
  • Exposure to log aggregation tools (ELK, Splunk).
  • Strong knowledge of cloud security and multi-region networking.

Responsibilities

  • Ensure system reliability and performance across multi-cloud, multi-region platforms.
  • Build and maintain observability solutions (OpenTelemetry, New Relic, Grafana) for real-time insights.
  • Automate infrastructure and deployments with Terraform and custom tooling.
  • Lead and participate in incident response: troubleshoot issues, restore service quickly, and drive root cause analysis.
  • Define and manage SLOs/SLIs that hold us accountable to business-critical SLAs.
  • Scale infrastructure capacity to meet growth and traffic demands.
  • Partner with developers to embed reliability best practices into application design and delivery.
  • Manage and optimize Kubernetes clusters across AWS and GCP.
  • Contribute to architecture reviews with a focus on reliability and scalability.
  • Foster a culture of continuous improvement, experimentation, and operational excellence.

Benefits

  • Health Insurance Coverage (medical, dental, and vision).
  • Life Insurance.
  • Short and Long-Term Disability Insurance.
  • Flexible Spending Accounts.
  • Holiday Pay.
  • 401(k) with match.
  • Employee Assistance Program.
  • Paid Parental Bonding Benefit Program.
  • Flexible Paid Time Off (PTO): 20 days of PTO for a full calendar year, increasing to 25 days after five years of service.