Senior Cloud Support Engineer (Onsite)

Shuvel Digital, San Antonio, TX
Onsite

About The Position

As a Senior Cloud Support Engineer on the Cloud Operations team, you will provide expert-level technical support for mission-critical systems and applications while ensuring the stability, reliability, and scalability of cloud infrastructure. You will play a key role in incident management, root cause analysis, automation, and continuous improvement, applying SRE principles and cloud best practices to drive operational excellence.

Requirements

  • 10+ years of technical support experience in enterprise environments.
  • AWS Certified Solutions Architect or Certified Kubernetes Administrator (CKA) certification required.
  • Strong expertise in monitoring tools (Datadog, Nagios, Prometheus, AWS CloudWatch, Splunk).
  • Proficiency in scripting and automation (Python, PowerShell, Bash).
  • Experience with cloud networking, security, IAM policies, and infrastructure optimization.
  • Familiarity with CI/CD pipelines, DevOps methodologies, and infrastructure as code (Terraform, CloudFormation).
  • Strong analytical thinking and problem-solving skills with a proactive mindset.
  • Excellent communication skills and ability to collaborate effectively across teams.

Nice To Haves

  • ITIL Foundation certification.
  • Hands-on experience with ServiceNow or similar ITSM platforms.

Responsibilities

  • Resolve complex technical incidents related to AWS infrastructure, networking, and applications within SLA targets.
  • Perform root cause analysis (RCA) and implement long-term solutions to prevent recurring issues.
  • Monitor system health using Datadog, Prometheus, AWS CloudWatch, and Splunk, responding proactively to alerts.
  • Automate operational tasks and incident response using Python, PowerShell, or Bash scripting.
  • Optimize AWS resources and configurations for cost efficiency while maintaining reliability and security.
  • Collaborate with DevOps and engineering teams to enhance CI/CD pipelines and automate deployments.
  • Maintain operational runbooks, SOPs, and knowledge base articles for efficient troubleshooting.
  • Mentor junior engineers and drive continuous service improvement through SRE best practices.