Principal Systems Engineer - Cloud

O'Reilly Auto Parts

About The Position

The Principal Systems Engineer applies expert-level knowledge of infrastructure, cloud architecture, and systems design to lead the evolution of cloud-native platforms and enterprise infrastructure. The position demands a high degree of technical leadership, operational foresight, and systems thinking. The ideal candidate has deep hands-on experience with Google Cloud Platform (GCP), automation frameworks, and highly available distributed systems.

Requirements

  • 10+ years of infrastructure or systems engineering experience, with 3+ years in a Principal or Lead role.
  • Proven hands-on expertise with GCP core services including VPC, GKE, Pub/Sub, Cloud Run, Cloud Functions, IAM, Cloud SQL, and BigQuery.
  • Deep understanding of Terraform, CI/CD pipelines (Cloud Build, GitHub Actions, or Jenkins), and container orchestration (Kubernetes).
  • Experience designing and supporting resilient, high-throughput distributed systems.
  • Strong background in Linux systems engineering, networking, and scripting (Python, Bash, or Go).
  • Familiarity with DevOps and SRE practices including monitoring, SLOs/SLIs, chaos testing, and auto-remediation.
  • Effective communicator with the ability to influence cross-functional teams and senior leadership.

Nice To Haves

  • Experience with multi-region, hybrid cloud, or enterprise-wide cloud migration strategies.
  • Relevant GCP certifications (e.g., Professional Cloud Architect, Professional Cloud DevOps Engineer) strongly preferred.

Responsibilities

  • Maintain expert-level knowledge of emerging GCP services, infrastructure design patterns, automation tooling, and site reliability engineering practices.
  • Provide architectural guidance and technical mentorship to engineers across infrastructure, DevOps, and platform teams.
  • Collaborate with Cloud Architects and Engineering Leadership to define standards for GCP resource hierarchy, networking, security, and cost optimization.
  • Lead the design, deployment, and automation of scalable, resilient infrastructure on GCP, using tools such as Terraform, Kubernetes (GKE), and Cloud Build.
  • Guide enterprise-wide initiatives focused on infrastructure modernization, cloud migration, and zero-downtime operations.
  • Implement best-in-class monitoring, logging, and observability practices using Cloud Monitoring and Cloud Logging (formerly Stackdriver), Prometheus, and third-party integrations.
  • Conduct design reviews, evaluate new services, and ensure infrastructure solutions align with performance, security, and compliance requirements.
  • Lead incident response and root cause analysis for infrastructure-impacting events, driving long-term stability improvements.
  • Define and enforce GCP IAM policies, network segmentation, org policies, and workload identity federation strategies.
  • Work closely with application and data engineering teams to optimize cloud deployments, container orchestration, and API gateways.
  • Own the infrastructure-as-code lifecycle, enabling consistent, version-controlled deployments across environments.
  • Contribute to technical roadmaps, budgeting forecasts, and capacity planning processes.
  • Document architectural decisions, cloud resource inventories, and runbooks for critical platform services.
  • Champion cloud-native best practices, DevSecOps principles, and efficient operational processes.

Benefits

  • Competitive Wages & Paid Time Off
  • Stock Purchase Plan & 401k with Employer Contributions Starting Day One
  • Medical, Dental, & Vision Insurance with Optional Flexible Spending Account (FSA)
  • Team Member Health/Wellbeing Programs
  • Tuition and Educational Assistance Programs
  • Opportunities for Career Growth

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000
