Senior Platform Systems Engineer

Bettermode
Toronto, ON
Hybrid

About The Position

At Bettermode, we are redefining how businesses streamline customer experiences and foster strong relationships. Our platform empowers businesses to seamlessly craft powerful web apps, tailored to their unique needs, with engagement tools at their core. Backed by Silicon Valley investors and trusted by brands like Lenovo, Mercedes, and Xano, we’re proud to connect millions of end-users daily. Join us as we continue building tools that redefine customer engagement!

This is not a generic DevOps role, not a narrow tool-operator role, and not a vendor-certified specialist role. You will help shape foundational parts of Bettermode's platform across Kubernetes runtime architecture, Terraform-governed AWS and Cloudflare infrastructure, service-to-service networking, data-plane efficiency, OLAP analytics systems, databases such as Aurora PostgreSQL and MongoDB Atlas, cost visibility, security/compliance governance, and deployment architecture. The role is intentionally broad across solutions architecture and systems programming: tune where appropriate, but build or redesign when necessary, with a strong emphasis on secure, recoverable, and well-governed platform foundations.

Our operating model includes a production on-call program: engineers participate in an every-other-week rotation for P0 incident response, post-incident learning, and production ownership.

Requirements

  • Deep production experience with Kubernetes/EKS, Terraform/OpenTofu, AWS, and Cloudflare, including secure deployment patterns, environment promotion, drift control, and operational ownership.
  • Strong software engineering fundamentals building backend, infrastructure, or distributed systems in production, with systems instincts for concurrency, performance, failure modes, and operational correctness.
  • Professional experience with Go or Rust for production platform components such as Kubernetes controllers, Terraform providers, CLIs, proxies, telemetry collectors, reconcilers, or internal developer tooling; TypeScript is valuable for higher-level platform definition and GitOps tooling where appropriate.
  • Solid understanding of Linux, TCP/IP, HTTP, HTTP/2, gRPC, connection behaviour under load, and service-to-service networking.
  • Strong understanding of security and compliance-oriented platform engineering, including SOC 2 evidence, OWASP-aligned practices, GDPR-aware data handling, IAM boundaries, encryption, secrets management, and audit trails.
  • Practical experience designing or operating DR capabilities, including backups, restore testing, RTO/RPO trade-offs, failover procedures, degraded-mode operation, and incident response playbooks.
  • Practical familiarity with databases or OLAP/data infrastructure such as Aurora PostgreSQL, MongoDB Atlas, Pinot, ClickHouse, or similar systems, plus the ability to reason from first principles instead of relying only on vendor defaults.

Nice To Haves

  • Have built Kubernetes controllers, operators, reconcilers, or long-running platform agents in production.
  • Have experience with service meshes, proxies, transport-aware systems, traffic steering, or network observability for cost attribution.
  • Have worked on cloud cost attribution, workload-level infrastructure observability, eBPF, VPC Flow Logs, or similar telemetry systems.
  • Have MLOps experience with KServe, Kubeflow, MLflow, model-serving infrastructure, GPU workloads, or other AI/ML platform systems.
  • Have contributed to brownfield infrastructure migration, Terraform/OpenTofu import workflows, drift detection, policy-as-code, or infrastructure governance.
  • Have experience replacing brittle YAML/Helm-heavy abstractions with typed platform tooling, GitOps generators, CDK-style infrastructure definitions, or internal developer platforms.

Responsibilities

  • Diagnose and remediate foundational platform problems across Kubernetes/EKS, Terraform-managed AWS and Cloudflare infrastructure, networking, observability, OLAP/data systems, security controls, and deployment architecture.
  • Own Kubernetes platform patterns and Terraform/OpenTofu workflows that make environments reproducible, reviewable, secure, and recoverable, including promotion, drift control, and policy-aware infrastructure changes.
  • Design AZ-aware and topology-aware improvements, starting with Aurora PostgreSQL routing/scalability and extending to other data-plane systems where traffic locality, availability, and cost matter.
  • Build cost and workload observability that attributes AWS infrastructure spend, network transfer, CPU/memory usage, and cross-AZ patterns to services, workloads, and teams.
  • Build production-grade platform components in Go, Rust, or TypeScript where appropriate, including Kubernetes controllers, Terraform plugins, telemetry collectors, bespoke proxies, and CLIs. Select the language based on SDK maturity, operational correctness, maintainability, and ecosystem fit.
  • Implement platform security and compliance controls aligned with SOC 2, OWASP, GDPR, IAM least privilege, secrets handling, encryption, network segmentation, auditability, and data protection.
  • Support OLAP analytics infrastructure and the migration from Pinot to ClickHouse, with attention to ingestion topology, query performance, data correctness, cost, and operational safety.
  • Design safe rollout, resilience, and DR patterns, including canaries, bypass modes, fast rollback, degraded-mode operation, backup/restore workflows, failover procedures, RTO/RPO trade-offs, and incident playbooks.

Benefits

  • Location-based, competitive compensation that reflects your expertise and impact, with annual reviews
  • Comprehensive Canadian health benefits—dental and vision included
  • Unlimited paid vacation days
  • Paid parental leave
  • Bereavement leave
  • All the equipment you need is provided, or you can bring your own device and use our Device Upgrade Policy, an interest-free hardware stipend repayable via payroll deductions
  • Monthly Tech & Appreciation Stipend
  • Complimentary snacks, coffee, video games, and board games at the downtown Toronto office
  • Globally diverse and collaborative team
  • Resources needed to succeed