Cloud SRE Engineer - Mandarin Bilingual

IntelliPro Group Inc. | Palo Alto, CA
$70 - $100 | Onsite

About The Position

The North America cloud operations team is looking for a skilled Cloud SRE Engineer to own the reliability, stability, and continuous improvement of core cloud services, spanning compute infrastructure (CVM/VMs), networking, and cloud security products. You'll work in a production-critical environment where operational excellence, deep technical expertise, and a self-directed mindset are essential. Because the North America team operates independently from teams in China and Singapore, with no overlapping working hours, we're looking for someone who can hit the ground running with minimal ramp-up time.

Requirements

  • Some SRE, DevOps, or cloud operations experience — ability to maintain application stability independently is essential given timezone constraints
  • Mandarin/English bilingual preferred — ability to communicate with teams in China and Singapore is a plus
  • Strong networking fundamentals (TCP/IP, DNS, HTTP, ICMP, load balancing, firewalls, VPC) OR deep Linux/CVM knowledge — ability to own either the networking or compute side of operations
  • Hands-on experience with cloud platforms (AWS, GCP, Azure, or equivalent) — deployment, usage, and high availability
  • Familiarity with Kubernetes and container-based deployments
  • Proficiency in at least one scripting language (Python, Shell, or Go) with automation experience
  • Strong troubleshooting and debugging skills across infrastructure layers
  • Experience with monitoring and alerting tools (Grafana, Prometheus, CloudWatch, or equivalent)
  • Bachelor's degree or above in Computer Science or a related field
  • Strong self-directed work ethic — able to operate independently with minimal supervision across time zones

Responsibilities

  • Monitor and maintain cloud compute (CVM), networking, and security products in the North America region to ensure high availability and system stability
  • Respond to and resolve production incidents, customer-reported issues, and system-level outages with urgency and ownership
  • Perform deep troubleshooting across network, compute, security, and platform layers
  • Participate in on-call rotation and handle live production issues independently
  • Deploy new features, bug fixes, and enhancements into production environments using CI/CD pipelines and internal tooling
  • Develop scripts and automation tools to improve operational efficiency and reduce toil
  • Build and improve monitoring, alerting, and disaster recovery systems for 24/7 operations
  • Document operational workflows, runbooks, and best practices
  • Work closely with R&D, security, and platform teams across time zones to drive service reliability
  • Communicate technical issues clearly to internal teams and B2B customers

Benefits

  • Comprehensive benefits package