Software Engineer, Platform Operations

Planet

53d•Remote

About The Position

A Software Engineer in Platform Operations is responsible for helping design, build, and operate the core infrastructure that enables Planet's engineering teams to reliably deploy and scale their services with minimal friction. This role is a key contributor to the evolution of our modern cloud-native platform, focusing on security, scalability, and reliability. You will work as part of a team that partners with software engineers across the organization to refine and improve the deployment and operational experience. The ideal candidate is excited by the opportunity to enable software engineering at scale and provide internal users with a “batteries included” infrastructure experience through our internal developer platform. This is a full-time, remote position based in the United States. If you are located near an office, you are expected to work from that office 3 days per week.

Requirements

4+ years of experience in a Platform Engineering, System Administration, DevOps, or Site Reliability Engineering (SRE) role.
Deep understanding of Kubernetes, underlying compute systems, and Linux.
Working knowledge of public clouds, particularly Google Cloud Platform (GCP) or Amazon Web Services (AWS).
Experience with CI/CD tools (e.g. GitLab, ArgoCD), Configuration Management (e.g. Terraform, Crossplane), and GitOps principles.
Ability to apply an operational mindset and strong troubleshooting skills in complex production environments.
Experience building services in languages such as Go and Python using tools like Git, Docker, and CI/CD workflows.
Experience building services that leverage cloud-based infrastructure and tooling such as AWS or GCP.
Ability to collaborate and clearly communicate designs and decisions verbally and in writing.

Nice To Haves

Experience in the operational management and development of core platform systems or open-source infrastructure projects.
Experience with maintaining highly available or operationally resilient infrastructure at very large scales or across multiple clouds.
Practical experience with networking and network architectures as they relate to platform infrastructure.
Experience with Observability tools and best practices.

Responsibilities

Design and implement core Infrastructure-as-Code (IaC) solutions to ensure the secure and scalable operation of Planet's services.
Contribute to major platform modernization initiatives, including the full migration from legacy tooling to new solutions.
Manage cloud-based infrastructure services, notably our fleet of Kubernetes clusters, and associated tooling to meet internal needs and support customer-facing service level agreements.
Enhance and maintain observability for key platform services, leveraging Grafana and other tools to establish Service Level Objectives (SLOs) and improve operational readiness.
Implement improvements and features for core systems owned by the team, such as GKE clusters, public API gateway, and other managed infrastructure solutions.
Collaborate with software engineering teams to refine the developer experience (DevEx) of our managed infrastructure.