About The Position

Cloud Operations & Support: Provision, configure, and maintain cloud infrastructure across AWS, Azure, GCP, and OCI. Monitor, troubleshoot, and resolve incidents, performance issues, and service outages in production and staging environments. Implement and maintain monitoring, alerting, and logging solutions to ensure high availability and reliability. Lead root cause analysis and post-mortem documentation for major incidents. Execute patch management, upgrades, and regular maintenance activities. Develop and maintain backup, disaster recovery, and failover strategies and operations. Participate in on-call rotation and after-hours support as required.

Automation & Infrastructure Management: Develop and maintain Infrastructure as Code (IaC) templates using tools such as Terraform, CloudFormation, ARM, or OCI Resource Manager. Use scripting (e.g., Python, Bash, PowerShell) to automate repetitive tasks and operational processes. Champion the use of configuration management tools and assist in DevOps pipeline integrations. Recommend and implement cost optimization, resource utilization, and rightsizing strategies.

Security & Compliance: Ensure adherence to security best practices, including least-privilege access, encryption, and network segmentation. Implement and manage identity and access management (IAM) policies and roles. Monitor, identify, and remediate security vulnerabilities reported by scanning tools or external advisories. Support compliance efforts related to customer and regulatory requirements (TxRAMP, ISO, SOC2, etc.).

Collaboration & Documentation: Work closely with application, security, and network teams for solution delivery and support. Mentor junior engineers and provide technical guidance as needed. Create and update technical documentation, runbooks, and SOPs. Participate in client calls to provide technical input when required.

Requirements

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or a related discipline; or equivalent professional experience.
  • At least two of the following certifications (or equivalent experience):
      • AWS Certified Solutions Architect / SysOps Administrator
      • Microsoft Certified: Azure Administrator Associate or Solutions Architect Expert
      • Google Professional Cloud Architect / Engineer
      • Oracle Cloud Infrastructure Architect Associate/Professional (Preferred)
  • 5+ years of hands-on experience in cloud engineering, operations, or support.
  • 3+ years of multi-cloud experience, with hands-on work in at least two of AWS, Azure, GCP, and OCI; AWS and Azure are mandatory, and familiarity with all four is preferred.
  • Direct experience in managed services/NOC/SOC/MSP environments is a plus.
  • In-depth expertise with provisioning, configuring, securing, supporting, and optimizing cloud-native and hybrid workloads in AWS, Azure, GCP, and/or OCI.
  • Experience administering compute, storage, networking, database, and PaaS services across supported platforms.

Nice To Haves

  • DevOps or automation certifications (e.g., Kubernetes, Terraform, Ansible).
  • ITIL Foundation or other support framework knowledge.

Responsibilities

  • Provision, configure, and maintain cloud infrastructure across AWS, Azure, GCP, and OCI.
  • Monitor, troubleshoot, and resolve incidents, performance issues, and service outages in production and staging environments.
  • Implement and maintain monitoring, alerting, and logging solutions to ensure high availability and reliability.
  • Lead root cause analysis and post-mortem documentation for major incidents.
  • Execute patch management, upgrades, and regular maintenance activities.
  • Develop and maintain backup, disaster recovery, and failover strategies and operations.
  • Participate in on-call rotation and after-hours support as required.
  • Develop and maintain Infrastructure as Code (IaC) templates using tools such as Terraform, CloudFormation, ARM, or OCI Resource Manager.
  • Use scripting (e.g., Python, Bash, PowerShell) to automate repetitive tasks and operational processes.
  • Champion the use of configuration management tools and assist in DevOps pipeline integrations.
  • Recommend and implement cost optimization, resource utilization, and rightsizing strategies.
  • Ensure adherence to security best practices, including least-privilege access, encryption, and network segmentation.
  • Implement and manage identity and access management (IAM) policies and roles.
  • Monitor, identify, and remediate security vulnerabilities reported by scanning tools or external advisories.
  • Support compliance efforts related to customer and regulatory requirements (TxRAMP, ISO, SOC2, etc.).
  • Work closely with application, security, and network teams for solution delivery and support.
  • Mentor junior engineers and provide technical guidance as needed.
  • Create and update technical documentation, runbooks, and SOPs.
  • Participate in client calls to provide technical input when required.