Platform Engineer

P-1 AI
Remote

About The Position

We are looking for a Platform Engineer to own how Archie is deployed, operated, and scaled across complex customer environments. A core part of this role is building and evolving a BYOC (Bring Your Own Cloud) deployment model, where Archie runs directly inside customer infrastructure. This is a foundational role at the intersection of infrastructure, product, and AI, where you will define the systems that allow our AI engineer to run reliably in real-world settings. You will be our first dedicated platform hire, responsible for building the substrate that every deployment depends on. As we scale, you will play a key role in shaping the entire platform engineering function.

This role can be either remote (based in the US or Canada, with existing work authorization) or based in our San Mateo office in the Bay Area. If you are remote, you should plan to spend one week per quarter co-working with the rest of the company in our San Mateo office, with occasional travel for team workshops in between. We will support relocation for candidates interested in moving to the Bay Area.

Requirements

  • You have built and operated Kubernetes-based systems in production.
  • You have designed infrastructure using IaC tools like Terraform, Pulumi, or similar.
  • You have implemented CI/CD or GitOps workflows that deploy systems across environments.
  • You are comfortable working across cloud providers (AWS, GCP, Azure) and understand their core primitives.
  • You have worked on systems that require strong security and networking considerations (e.g., VPCs, identity, encryption).
  • You like owning problems end-to-end and are comfortable operating in ambiguous, zero-to-one environments.

Responsibilities

  • Design the BYOC deployment model for Archie across customer environments, including how systems are packaged, installed, and updated, and how they interact with P-1’s central systems.
  • Build and own Kubernetes-based infrastructure that runs reliably across multiple clouds and customer setups.
  • Create deployment tooling using Helm, GitOps, or similar approaches to make installation and operations repeatable.
  • Implement secure and reliable update mechanisms that work even in restricted or locked-down network environments.
  • Work closely with product, research, and forward-deployed engineers to translate requirements into platform capabilities.
  • Debug and improve production deployments, identifying bottlenecks in performance, reliability, or operability.
  • Establish observability, monitoring, and debugging patterns that give us visibility into distributed systems we don’t fully control.

Benefits

  • healthcare
  • dental
  • vision insurance
  • 401k with employer matching
  • unlimited PTO