Cloud Operations Engineer

Provenir
Hybrid

About The Position

As a Cloud Operations Engineer, you will play a crucial role in managing and supporting the infrastructure necessary for hosting our products on the AWS cloud. Your expertise in resolving product-related issues, coupled with your ability to automate tasks and optimize performance, will ensure our hosting operations are seamless and efficient. This is a 24/7 rotational shift role.

Requirements

  • 6-10 years of industry experience
  • Strong Linux Skills: Experience overseeing the day-to-day operations of cloud-based applications running on production Linux environments, ensuring their stability, performance, and security. This includes patch management, performance tuning, and system monitoring, as well as familiarity with JVMs, heap dumps, system performance analysis, installations, configurations, upgrades, and proficient command-line usage.
  • Cloud Support Operations: Proven background in service operations roles, especially in daily customer interactions, technical issue resolution through ticket triaging, and independent RCA drafting.
  • SaaS and AWS Experience: Demonstrated experience with SaaS solutions and services, particularly managing enterprise applications on AWS cloud, focusing on availability and performance.
  • In-depth knowledge of AWS services like Storage, Databases, IAM, ECS, EKS, and CloudWatch.
  • Exceptional skills in diagnosing and resolving technical issues related to product hosting and infrastructure.
  • Experience with tools like Datadog, Splunk, Grafana, and Prometheus for cloud observability, monitoring, and alerting.
  • Experience managing cloud infrastructure upgrades and change management.
  • Proficiency in using tools like Jenkins, Terraform, and scripting/automation for infrastructure setup, automation, release, and task management.

Nice To Haves

  • Experience with Kubernetes and containerization technologies is highly desirable.

Responsibilities

  • Tackle technical challenges identified through monitoring tools or reported by customers, ensuring timely resolution of issues related to product hosting, infrastructure setup, networking, security, and more.
  • Work closely with cross-functional teams, product development, and clients to maintain high availability and performance.
  • Engage with cross-functional teams, product development, DevOps, and clients to comprehend their requirements, offer technical guidance, and address product hosting and infrastructure concerns.
  • Participate actively in incident, change, and problem management, adhering to ITIL best practices.
  • Establish and maintain essential infrastructure components and services for product hosting on multi-vendor cloud platforms.
  • Use tools such as Git, Jenkins, Terraform, and scripting/automation to automate setup processes, improving deployment and configuration efficiency and overall operational effectiveness.
  • Use cloud observability tools such as Datadog, New Relic, Splunk, and Prometheus to monitor and improve the hosting environment's performance and health.
  • Identify and resolve performance bottlenecks, ensuring optimal performance and availability.
  • Use scripting and automation tools to automate routine tasks, including backups, scaling, monitoring, and maintenance, boosting operational efficiency and minimizing manual effort.
  • Keep comprehensive documentation of infrastructure setups, configurations, and best practices.
  • Regularly report on infrastructure performance, issue resolutions, and automation efforts to stakeholders.

Benefits

  • Comprehensive health and wellness plans
  • Paid time off
  • Company holidays
  • Flexible and remote-friendly opportunities
  • Maternity/paternity leave