Site Reliability Engineer | Growth and Transformation

Red Ventures • Charlotte, NC

99d • $100,000 - $145,000

About The Position

The Growth and Transformation team at Red Ventures is seeking a Site Reliability Engineer (SRE) to keep our platforms and applications resilient, scalable, and fast. You’ll work across AWS, GCP, and Kubernetes environments, helping us meet ambitious 99.99% uptime goals while driving automation, observability, and performance improvements at scale. This isn’t a reactive role: you’ll be empowered to design reliability into our systems from the start, partner with engineers across the org, and continually improve how we build, monitor, and operate mission-critical services. The role follows a hybrid schedule, with Tuesday through Thursday at our South Charlotte, NC headquarters and fully remote work on Mondays and Fridays.

Requirements

3–5 years in SRE, DevOps, or cloud infrastructure engineering.
Strong experience with AWS, GCP, and Kubernetes orchestration.
Skilled in infrastructure as code (Terraform).
Proficient in observability and monitoring tools (New Relic, Grafana, OpenTelemetry).
Familiar with CI/CD pipelines, automated deployments, and scripting (Python, Bash, Go, etc.).
Experience maintaining high-availability systems (99.9%+).
Strong grasp of distributed systems, microservices, and scalability patterns.
Incident response and troubleshooting experience with a focus on learning from failures.
Excellent communication and collaboration skills.

Nice To Haves

Certifications (AWS Solutions Architect, GCP Professional Cloud Architect).
Experience with chaos engineering, resilience testing, or load balancing at scale.
Familiarity with Salesforce or Adobe ecosystems.
Database performance tuning expertise.
Exposure to log aggregation tools (ELK, Splunk).
Strong knowledge of cloud security and multi-region networking.

Responsibilities

Ensure system reliability and performance across multi-cloud, multi-region platforms.
Build and maintain observability solutions (OpenTelemetry, New Relic, Grafana) for real-time insights.
Automate infrastructure and deployments with Terraform and custom tooling.
Lead and participate in incident response: troubleshoot issues, restore service quickly, and drive root cause analysis.
Define and manage SLOs/SLIs that hold us accountable to business-critical SLAs.
Scale infrastructure capacity to meet growth and traffic demands.
Partner with developers to embed reliability best practices into application design and delivery.
Manage and optimize Kubernetes clusters across AWS and GCP.
Contribute to architecture reviews with a focus on reliability and scalability.
Foster a culture of continuous improvement, experimentation, and operational excellence.

Benefits

Health Insurance Coverage (medical, dental, and vision).
Life Insurance.
Short and Long-Term Disability Insurance.
Flexible Spending Accounts.
Holiday Pay.
401(k) with match.
Employee Assistance Program.
Paid Parental Bonding Benefit Program.
Flexible Paid Time Off (PTO): 20 days of PTO for a full calendar year, increasing to 25 days after five years of service.

What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Number of Employees: 1,001-5,000
