About The Position

Cloud Operations & Support: Support customer self-provisioning of cloud instances across GCP, AWS, Azure, and OCI, with security guardrails and backend deployment. Monitor, troubleshoot, and resolve incidents, performance issues, and service outages in production and staging environments. Implement and maintain monitoring, alerting, and logging solutions to ensure high availability and reliability. Lead root cause analysis and post-mortem documentation for major incidents. Execute patch management, upgrades, and regular maintenance activities. Develop and maintain backup, disaster recovery, and failover strategies and operations. Participate in on-call rotation and after-hours support as required.

Automation & Infrastructure Management: Develop and maintain Infrastructure as Code (IaC) templates using tools such as Terraform, CloudFormation, ARM, or OCI Resource Manager. Use scripting (e.g., Python, Bash, PowerShell) to automate repetitive tasks and operational processes. Champion the use of configuration management tools and assist in DevOps pipeline integrations. Recommend and implement cost optimization, resource utilization, and rightsizing strategies. Ensure adherence to security best practices, including least-privilege access, encryption, and network segmentation. Implement and manage identity and access management (IAM) policies and roles. Monitor, identify, and remediate security vulnerabilities reported by scanning tools or external advisories. Support compliance efforts related to customer and regulatory requirements (TxRAMP, ISO, SOC2, etc.).

Collaboration & Documentation: Work closely with application, security, and network teams for solution delivery and support. Mentor junior engineers and provide technical guidance as needed. Create and update technical documentation, runbooks, and SOPs. Participate in client calls to provide technical input when required.

Requirements

  • Bachelor's degree (or equivalent experience) in Computer Science, IT, Engineering, or a related field.
  • 8+ years of hands-on experience in cloud engineering, operations, or support.
  • 7+ years of multi-cloud experience (must have hands-on experience with at least 4 of AWS/Azure/GCP/OCI; familiarity with all is preferred; AWS and Azure are mandatory).
  • Direct experience in managed services/NOC/SOC/MSP environments is a plus.
  • In-depth expertise with provisioning, configuring, securing, supporting, and optimizing cloud-native and hybrid workloads in GCP, AWS, Azure, and/or OCI.
  • Administration of compute, storage, networking, database, and PaaS services across supported platforms.
  • Deep hands-on knowledge of architecture, deployment, monitoring, and troubleshooting on major public cloud platforms (GCP, AWS, Azure, OCI).
  • Experience with CI/CD pipelines, containerization (Docker, Kubernetes), and automation tools (Terraform, CloudFormation, ARM templates, etc.).
  • Familiarity with cloud-native security best practices (IAM, network security, data encryption, etc.).
  • Proficiency in scripting languages (Python, Bash, PowerShell, etc.).
  • Expertise in ServiceNow ITSM and familiarity with integrating AWS, Azure, GCP, and OCI into the service catalog.
  • Expertise in cloud cost optimization.
  • Familiarity with Apptio Cloudability.
  • Familiarity with AppGate SDP, Qualys TotalCloud, Qualys Patch Management, Qualys CSAM, CrowdStrike, Palo Alto NGFW, etc.
  • Ability to analyze logs and monitor performance using native tools (Google Cloud Monitoring, formerly Stackdriver; CloudWatch; Azure Monitor; OCI Monitoring; etc.).
  • Strong understanding of backup strategy, disaster recovery, and high-availability architecture.
  • Ability to support customer remote file services using Azure File Share in Azure Gov Cloud (Azure Cloud Sr Engineer).
  • Strong expertise in multi-cloud security compliance, data encryption, network security, user access control, private endpoint setup, etc.
  • Strong expertise in GitHub and repository management.
  • Ability to set up alert rules and thresholds in GCP Monitoring, Azure Monitor, AWS CloudWatch, and OCI Monitoring, and to route the resulting alerts to ServiceNow incident ticketing.
  • Ability to connect multi-cloud VMs and instances to Microsoft Sentinel SIEM.
  • Ability to support customer self-provisioning of cloud instances with the required security guardrails via Azure Blueprints, AWS Control Tower, etc.

Nice To Haves

  • Google Professional Cloud Architect / Engineer
  • AWS Certified Solutions Architect / SysOps Administrator
  • Microsoft Certified: Azure Administrator Associate or Solutions Architect Expert
  • Oracle Cloud Infrastructure Architect Associate/Professional (Preferred)
  • DevOps or automation certifications (e.g., Kubernetes, Terraform) (Preferred)
  • ITIL Foundation or other support framework knowledge.

Responsibilities

  • Support customer self-provisioning of cloud instances across GCP, AWS, Azure, and OCI, with security guardrails and backend deployment.
  • Monitor, troubleshoot, and resolve incidents, performance issues, and service outages in production and staging environments.
  • Implement and maintain monitoring, alerting, and logging solutions to ensure high availability and reliability.
  • Lead root cause analysis and post-mortem documentation for major incidents.
  • Execute patch management, upgrades, and regular maintenance activities.
  • Develop and maintain backup, disaster recovery, and failover strategies and operations.
  • Participate in on-call rotation and after-hours support as required.
  • Develop and maintain Infrastructure as Code (IaC) templates using tools such as Terraform, CloudFormation, ARM, or OCI Resource Manager.
  • Use scripting (e.g., Python, Bash, PowerShell) to automate repetitive tasks and operational processes.
  • Champion the use of configuration management tools and assist in DevOps pipeline integrations.
  • Recommend and implement cost optimization, resource utilization, and rightsizing strategies.
  • Ensure adherence to security best practices, including least-privilege access, encryption, and network segmentation.
  • Implement and manage identity and access management (IAM) policies and roles.
  • Monitor, identify, and remediate security vulnerabilities reported by scanning tools or external advisories.
  • Support compliance efforts related to customer and regulatory requirements (TxRAMP, ISO, SOC2, etc.).
  • Work closely with application, security, and network teams for solution delivery and support.
  • Mentor junior engineers and provide technical guidance as needed.
  • Create and update technical documentation, runbooks, and SOPs.
  • Participate in client calls to provide technical input when required.