Cloud Operations Engineer

Avalon Administrative Services LLC
Tampa, FL
Remote

About The Position

The Cloud Operations Engineer is responsible for securing, automating, and operating Avalon Healthcare Solutions’ cloud infrastructure. This role supports the reliability, availability, and performance of business‑critical platforms by designing and administering secure AWS environments and applying Site Reliability Engineering (SRE) best practices. The Cloud Operations Engineer partners closely with engineering, security, and operations teams to ensure infrastructure stability, observability, and continuous improvement while supporting the organization’s mission to deliver high‑quality, cost‑effective diagnostic intelligence solutions. This position is eligible for remote work; however, quarterly travel to Avalon’s corporate office in Tampa, Florida may be required.

Requirements

  • Five or more years of experience in cloud engineering, network security, infrastructure operations, or SRE.
  • Bachelor’s degree in Computer Science, Information Technology, Engineering, or equivalent experience.
  • Strong debugging and troubleshooting skills across cloud and network environments.

Nice To Haves

  • AWS certifications (Security, Solutions Architect, or Advanced Networking).
  • Experience with Infrastructure as Code (IaC) using AWS CloudFormation.
  • Proficiency with PowerShell scripting and working knowledge of JSON.
  • Strong working knowledge of Windows operating systems, Active Directory, and Microsoft 365.

Responsibilities

  • Design, configure, and maintain secure AWS network environments, including VPCs, routing, firewalls, NAT, VPNs, Transit Gateway, WAF, and API Gateways.
  • Implement and manage AWS security controls, including IAM roles, policies, and Security Groups.
  • Administer AWS services such as IAM, Systems Manager (SSM), EC2, ECS, AWS Fargate, and Lambda.
  • Monitor AWS infrastructure and applications using Grafana, OpenSearch, and Amazon CloudWatch.
  • Troubleshoot infrastructure, security, and network incidents; perform root cause analysis and remediation.
  • Apply Site Reliability Engineering (SRE) principles to improve reliability, automation, and observability.
  • Install, configure, and maintain business‑critical applications.
  • Create and maintain technical documentation including system diagrams, procedures, and runbooks.