Senior Cloud/Backend Engineer

Triple Whale
New York, NY
Remote

About The Position

Triple Whale is seeking a Senior Cloud/Backend Engineer to join its Infrastructure Team. The ideal candidate enjoys solving complex problems, working in detail, assisting other developers, and taking ownership of systems from beginning to end. This role is crucial for maintaining the reliability of the platform and includes participation in a weekend on-call rotation. The company is looking for individuals who are passionate about building reliable, scalable systems and who thrive on open-ended challenges: designing, building, and owning complex infrastructure.

Triple Whale is a leading intelligence platform for ecommerce, empowering brands to understand and drive growth with confidence. By consolidating data, providing trusted measurement tools, and leveraging advanced AI, Triple Whale transforms fragmented data into clear insights and actionable recommendations. Its AI agents and automations enhance creative asset generation, optimize marketing channels, and improve the effectiveness of various tools. Over 60,000 ecommerce brands rely on Triple Whale to accelerate growth and revenue efficiently, uncovering opportunities and executing them at an unprecedented scale.

Requirements

  • Located in the New York tri-state area
  • 3+ years of experience as an independent backend or infrastructure engineer
  • Ability to design and build scalable, reliable systems
  • Strong communication skills
  • Hands-on builder mentality: this is a coding role
  • Experience with relational and non-relational databases
  • Experience with major Cloud platforms (GCP, AWS, Azure)
  • Experience with streaming systems
  • Experience scaling large systems
  • Experience with message queues
  • Experience with monitoring systems like DataDog, Grafana, Groundcover
  • Experience with CI/CD, Git
  • Previous real-world experience in production on-call environments
  • Ability to quickly assess incidents, identify scope/root cause, and understand platform impact
  • Ability to classify severity and prioritize response appropriately
  • Capability to deploy safe production hotfixes when needed
  • Solid judgment under pressure, especially when operating independently
  • Ownership mindset: from detection to mitigation to resolution

Nice To Haves

  • GCP experience
  • Kubernetes and Knative (production experience)
  • ClickHouse

Responsibilities

  • Deploy and support service infrastructure in Kubernetes
  • Identify and build the right tools and technologies for major initiatives
  • Assist other teams and developers in designing robust and scalable systems
  • Scale and optimize multiple databases
  • Build internal tooling to accelerate developer velocity
  • Provide observability, monitoring, and visibility across systems
  • Participate in a shared on-call rotation (typically 2-3 times per month, Friday 7:00 AM ET through Saturday 5:00 PM ET)
  • Assess incidents, identify scope/root cause, and understand platform impact
  • Classify severity and prioritize response appropriately
  • Deploy safe production hotfixes when needed
  • Demonstrate solid judgment under pressure, especially when operating independently
  • Take ownership from detection to mitigation to resolution of issues

Benefits

  • Full internal training to understand architecture, operational flows, and incident procedures