Staff Software Engineer - Platform, SysEng | Canada | Remote

Grafana Labs
CA$186,368 - CA$223,642 | Remote

About The Position

Grafana Labs is seeking a Staff Software Engineer for its Platform SysEng team. The role is central to scaling Grafana Cloud, a high-availability, low-latency observability platform that processes millions of metrics, log lines, and traces per second. It requires a passion for performance and reliability, with the ability to take projects from conception to production.

The Internal Engineering Platform (IEP) team provides engineers with the tools, systems, and Kubernetes clusters needed to build, deploy, and run their workloads. The role involves working in a squad focused on cloud infrastructure, capacity management, security, engineering productivity, monitoring, sustainability, and US Federal compliance. The Platform SysEng squad is focused on the maturity and scalability of the platform, with a current goal of reducing new region build timelines to meet customer demands. It is part of a Platform Engineering group that manages infrastructure for teams building key tools like Grafana, Mimir, Loki, Tempo, and Pyroscope.

Because the squad deploys production services, on-call rotations are part of the role. They help ensure system health and provide an opportunity to understand the product line-up and user experience.

Requirements

  • Proven delivery of large distributed systems.
  • Experience shipping and operating complex systems that span multiple teams, with clear evidence of technical leadership and impact.
  • Demonstrable experience in system design.
  • Deep understanding of tradeoffs around latency, consistency, availability, scaling and cost.
  • Hands-on cloud and platform experience.
  • Solid experience with cloud-native architectures (microservices, containers/Kubernetes, IaC) and the operational practices that keep them healthy.
  • Reliability and performance ownership.
  • Excellent coding and design skills.
  • Comfort with AI-assisted development: curiosity about and ease with AI-powered developer tools, ideally with practical experience folding them into a team’s workflow.
  • Influence without authority: Ability to align cross-functional stakeholders, set priorities and drive outcomes in a remote-first environment.
  • Strong communicator: Clear written and verbal communication that works across engineers and non-technical stakeholders.
  • Experience with Go, Python, C, C++, Rust, or similar languages.

Nice To Haves

  • Previous work in or on open source or other community-based projects.
  • Familiarity with Kubernetes scheduling and projects like Karpenter.
  • Terraform and/or Crossplane experience.
  • Experience with Tanka and/or Jsonnet.

Responsibilities

  • Provide application engineers with the tools, systems, and Kubernetes clusters they need to build, deploy, and run their workloads.
  • Focus on the maturity and scalability of the platform.
  • Reduce new region build timelines to meet customer demands.
  • Manage infrastructure for teams building key tools like Grafana, Mimir, Loki, Tempo, and Pyroscope.
  • Ensure the health of the system through on-call rotations.
  • Define SLOs/SLIs, do capacity planning, tune performance, and drive reliability work end-to-end.
  • Write clear, maintainable, well-tested code.
  • Lead technical designs.
  • Use modern AI coding assistants as part of the daily workflow.
  • Align cross-functional stakeholders, set priorities, and drive outcomes in a remote-first environment.
  • Communicate clearly in writing and verbally across engineers and non-technical stakeholders.

Benefits

  • Equity
  • Bonus (if applicable)
  • Restricted Stock Units (RSUs)
  • 30 days of global annual leave
  • 3 days of Grafana Shutdown Days
  • Company-funded usage budget for AI coding assistants