Senior Cloud Operations Engineer

Sage Group
Atlanta, GA
56d
Hybrid

About The Position

We are seeking a reliable and forward-thinking Senior Cloud Operations Engineer to help us scale and stabilize the cloud infrastructure that powers our growing suite of SaaS products. In this role, you will own, maintain, and improve production and development environments across AWS and Azure, ensuring they are highly available, secure, and performant. This is a hybrid role: 3 days per week onsite in Atlanta. You'll collaborate with engineering, platform, and release teams to deliver operational excellence across all lifecycle stages. As our infrastructure footprint grows rapidly, we rely on well-defined processes and standards to ensure we move fast without sacrificing reliability. This role is ideal for someone who thrives on owning infrastructure, solving complex problems, and building systems that are resilient, observable, and release-ready.

Requirements

  • 5+ years in Cloud Operations, Site Reliability Engineering, or Systems Engineering roles in cloud-first or SaaS environments.
  • Deep experience operating and troubleshooting infrastructure on AWS (EC2, S3, RDS, CloudFront, Lambda) and/or Azure.
  • Expert in Linux system administration (Red Hat or Debian), with strong fluency in networking, web stack configuration, and OS-level tuning.
  • Proficient in Python and Bash scripting with a strong automation mindset.
  • Hands-on experience with infrastructure-as-code using Terraform, Ansible, or CloudFormation.
  • Proven success in diagnosing and resolving high-impact production issues in distributed systems.
  • Familiarity with modern monitoring and observability stacks (e.g., Prometheus, ELK, Zabbix).
  • Comfortable supporting and optimizing web technologies such as NGINX, HAProxy, Apache, or Tomcat.
  • Strong communication skills and the ability to work effectively within a fast-paced, highly collaborative team environment.

Nice To Haves

  • Experience with Docker, Kubernetes, or container-based infrastructure.
  • Familiarity with change and release readiness frameworks, including validation, monitoring, rollback, and support requirements.
  • Previous experience improving the quality and efficiency of on-call practices or production runbooks.
  • Working knowledge of CI/CD pipelines and release automation tooling.
  • A structured, systems-oriented mindset with a passion for operational excellence, process discipline, and scaling infrastructure through automation.

Responsibilities

  • Proactively support the health of core services.
  • Respond to incidents.
  • Implement automation.
  • Ensure our systems evolve with growing scale and complexity.

Benefits

  • Competitive salaries that landed us in the top 5% of similarly sized companies (according to Comparably)
  • Comprehensive health, dental and vision coverage
  • 401(k) retirement match (100% matching up to 4%)
  • 32 days of paid time off (21 personal days, 10 national holidays, 1 floating holiday)
  • 18 weeks of paid parental leave for birth, adoption, or surrogacy, available 1 year after start date
  • 5 paid days per year to volunteer (through Sage Foundation)
  • $5,250 tuition reimbursement per calendar year starting 6 months after hire date
  • Sage Wellness Rewards Program ($600 wellness credit and $360 fitness reimbursement annually)

What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Industry: Professional, Scientific, and Technical Services
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000 employees
