Principal Software Engineer, AI Cloud

Decision Foundry
Seattle, WA
2d
$232,000 - $319,000
Remote

About The Position

On behalf of our client, we are seeking a Principal Software Engineer who will define the technical vision and lead the design and implementation of AI Cloud’s distributed systems. As a key member of the AI Cloud leadership team, you will partner with principal engineers across the company to architect scalable, reliable, and secure infrastructure that supports millions of developers and thousands of enterprises.

Requirements

  • 10+ years of software engineering experience, including 3+ years in technical leadership roles (Staff or Principal level)
  • Proven experience designing and building highly scalable distributed systems in production environments
  • Deep understanding of cloud infrastructure (AWS, Azure, GCP, or OCI), including compute, networking, and storage primitives
  • Proficiency in Go, Rust, or Java
  • Expertise in Kubernetes, microservices, and service mesh architectures
  • Strong foundation in observability, CI/CD, and infrastructure-as-code (Terraform, Pulumi, or CloudFormation)
  • Experience operating high-availability (99.99%+) production systems
  • Exceptional communication skills and ability to influence across technical and business domains
  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience

Nice To Haves

  • Experience designing multi-cloud or cross-cloud abstractions and orchestration layers
  • Knowledge of container lifecycle management, networking, and policy enforcement
  • Prior experience in developer infrastructure, PaaS, or hyperscale SaaS environments
  • Background contributing to open source or developer-focused platforms

Responsibilities

  • Define and drive the long-term technical strategy for AI Cloud’s control and data plane services.
  • Architect highly available, multi-region systems capable of operating seamlessly across multiple cloud providers.
  • Design APIs and service abstractions that integrate Desktop, Hub, and enterprise cloud services.
  • Establish standards for reliability, scalability, and observability across the AI Cloud platform.
  • Lead cross-functional technical discussions and influence architectural decisions company-wide.
  • Design and implement distributed systems for workload orchestration, service discovery, and lifecycle management.
  • Build and operate control plane components that manage multi-tenant workloads and cloud networking.
  • Develop infrastructure that delivers predictable performance, intelligent scaling, and automated failover.
  • Ensure security, data integrity, and compliance across the global infrastructure footprint.
  • Partner with platform and product teams to deliver developer-friendly APIs and cloud experiences.
  • Align technical direction with business objectives for cloud growth and developer platform unification.
  • Evaluate emerging technologies (e.g., service meshes, container orchestration, edge computing) and guide adoption.
  • Drive initiatives that reduce latency, optimize cost, and improve cross-cloud performance.
  • Define metrics and SLAs for AI Cloud’s reliability and scalability.
  • Mentor senior, staff, and principal engineers, fostering technical excellence and growth across teams.
  • Lead design reviews and guide critical production system decisions.
  • Drive a culture of operational excellence, ownership, and innovation.
  • Collaborate with engineering and product leadership to align priorities and resource planning.
  • Take part in on-call rotation for your team; respond to incidents, debug production issues, and drive continuous improvement of system reliability.