About The Position

Grafana Labs is a remote-first, open-source company whose visualization tool, Grafana, has over 20 million users worldwide. The company helps over 3,000 companies manage their observability strategies with the Grafana LGTM Stack, which offers scalable metrics, logs, and traces. Grafana Labs is scaling rapidly while maintaining its open-source legacy, a global collaborative culture, and a passion for meaningful work, and it fosters an innovation-driven environment built on transparency, autonomy, and trust. This is a remote opportunity open only to applicants in USA time zones.

The Internal Engineering Platform, delivered by the Platform department, provides application engineers with the tools, systems, and Kubernetes clusters they need to build, deploy, and run workloads. Platform roles focus on performance and reliability, taking projects from conception to production. The department is organized into squads focusing on Cloud Infrastructure, Networking and Security; Engineering Productivity; Capacity management, Client Administrative Tooling (CAT); and US Federal compliance. Because the department deploys production services, on-call rotations are part of the role, both to ensure system health and to understand how the product is used.

The Platform Productivity squad is responsible for helping internal engineers release software onto infrastructure securely and measurably. This includes automating release processes (anything from CI/CD to bootstrapping) and guiding internal engineering teams with 'golden path' techniques, while also supporting edge cases and maximizing tool utilization. Ultimately, the team serves as the Platform Team for those building observability tools like Grafana, Mimir, Loki, and Tempo.

Requirements

  • Comfortable working in a remote-first company; communication is key
  • Eager to learn and grow
  • Approach development holistically, owning the full life cycle of code
  • Flexible software engineer, able to respond to incidents, integrate existing systems, or design and implement new systems
  • Experience operating your own code
  • Experience with Kubernetes and Docker
  • Experience with Infrastructure as Code (e.g., grafana/tanka)
  • Ability to contribute to a discussion but then commit to the team decision

Nice To Haves

  • Engineering/software development experience within a Platform group delivering services to internal engineering teams
  • Experience working in a cloud environment
  • Infrastructure as Code with Terraform/Crossplane
  • Familiarity with Kubernetes administration
  • Experience with Tanka
  • Experience/Interest in implementing, integrating, and maintaining observability systems and processes

Responsibilities

  • Maintain, improve and extend existing systems within the squad's roadmap
  • Be involved in choosing future focus areas and gracefully sunsetting systems which are no longer needed
  • Help the team to design, compare, and choose appropriate solutions for various tasks
  • Development and maintenance of our Internal Engineering Platform (IEP)
  • CI/CD platform management and development
  • Build, release and deployment automation
  • Application configuration management tooling
  • Automation that keeps software up to date
  • Artefact management
  • Working with diverse internal teams, from application development to security, to support implementation of their requirements
  • Being part of an on-call rotation to support Platform tooling
  • Working with engineers, as well as with the management structures that are there to support you and enable you and your team to do your very best
  • Working collaboratively, and being friendly, kind, and respectful, in a remote-first company
  • Owning the full life cycle of our code, from writing design docs, looking at developer feedback, testing and deployment, all the way through to decommissioning
  • Responding to incidents, integrating existing systems, or designing and implementing our own systems

Benefits

  • Equity (Restricted Stock Units - RSUs)
  • Bonus (if applicable)
  • 100% Remote, Global Culture
  • Scaling Organization
  • Transparent Communication
  • Innovation-Driven environment
  • Open Source Roots
  • Empowered Teams
  • Career Growth Pathways
  • Approachable Leadership
  • Passionate People
  • In-Person onboarding
  • Global annual leave policy of 30 days per annum
  • 3 days of annual leave entitlement reserved for Grafana Shutdown Days
  • Modern AI coding assistants as part of your daily workflow (your choice of tools, within security guidelines), backed by a company-funded usage budget
  • Access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.6, Gemini 3 Pro)