Senior DevOps Engineer

IREN, Vancouver, BC
$135,000 - $150,000

About The Position

IREN is a leading AI Cloud Service Provider, delivering large-scale GPU clusters for AI training and inference. IREN’s vertically integrated platform is underpinned by its expansive portfolio of grid-connected land and data centers in renewable-rich regions across the U.S. and Canada. We build, own and operate our data centers, run them on 100% renewable energy, and take pride in being at the forefront of sustainable solutions for the ever-evolving applications of high-performance compute. We believe that human progress is invaluable, but it should be pursued in the right way: responsibly, sustainably, and with a positive impact on the communities we operate in.

As a Senior DevOps Engineer, you will help build and operate the foundational systems that power our GPU-enabled multi-tenant environment. You will design and maintain the infrastructure, automation, and platform abstractions that enable product teams to deliver reliable, scalable, and high-performance services across our cloud and on-prem footprint.

Requirements

  • 5+ years of experience in Platform Engineering, DevOps, Cloud Infrastructure, or SRE roles.
  • Proficient in Python or Go for platform tooling, automation, and API integration.
  • Hands-on experience with Kubernetes (operations, networking, storage, RBAC, and multi-tenant considerations).
  • Strong background in AWS services including EC2, EKS/ECS, S3, IAM, VPC, RDS.
  • Expertise with Infrastructure-as-Code (Terraform or CloudFormation) and GitOps workflows (ArgoCD).
  • Familiarity with service mesh, ingress controllers, and CNI network stacks.
  • Strong understanding of observability: Prometheus (metrics/alerting), Grafana (dashboards), and logs.
  • Familiarity with event-driven systems (Kafka basics: topics, partitions, consumer groups).
  • Experience with secure configuration of cloud/K8s systems (RBAC, SSO integrations, NetworkPolicies).
  • Clear and structured communicator with cross-functional teams.
  • Strong problem-solving skills, including the ability to debug complex distributed systems.
  • Pragmatic engineer who balances long-term quality with short-term delivery.
  • Collaborative team member comfortable working across Platform, Infrastructure, and Product domains.

Nice To Haves

  • Experience running on-prem Kubernetes or hybrid cloud infrastructure.
  • Familiarity with GPU workloads, node management, or multi-tenant compute isolation.
  • Hands-on experience with load testing, chaos engineering, or advanced performance tuning.
  • Exposure to internal developer platforms (IDP) or platform-as-a-product mindset.
  • Experience with Helm, Kustomize, Pulumi, or Temporal workflow engine.

Responsibilities

  • Design, develop, and operate core platform services that support compute, networking, and storage across cloud and on-prem environments.
  • Build and maintain Infrastructure-as-Code using tools like Terraform or CloudFormation, following GitOps best practices with ArgoCD.
  • Implement and operate continuous delivery pipelines and deployment automation for platform services.
  • Improve platform scalability, reliability, and performance across Kubernetes and cloud systems.
  • Collaborate with HPC, Networking, and Security teams to ensure resilient, multi-tenant platform architectures.
  • Operate and optimize Kubernetes clusters (EKS and on-prem K8s distributions).
  • Manage networking components such as CNI plugins, Ingress controllers, and Service Mesh (Envoy/Istio/NGINX).
  • Configure and maintain RBAC, Network Policies, and identity integration for secure, least-privilege access.
  • Work with cloud resources (AWS EC2, EKS, S3, RDS) to deploy, scale, and secure platform services.
  • Implement and maintain platform observability tooling using Prometheus, Grafana, Loki, and alerting pipelines.
  • Establish and maintain SLOs/SLAs, responding to incidents and continuously improving reliability.
  • Debug distributed systems issues across compute, network, storage, and application surfaces.
  • Contribute to on-call rotations and production readiness reviews.

Benefits

  • Medical, dental, and vision insurance coverage – 100% company paid for employees and dependents
  • Company-paid life and disability insurance
  • Voluntary life and critical illness coverage available
  • Employee Assistance Program and virtual health care platform
  • RRSP with company match
  • Voluntary TFSA
  • 3 weeks annually for vacation and paid holidays
  • Opportunities for advancement and internal mobility
  • Training and personal development opportunities
  • Company events and team-building activities