Principal Engineer - Platform Architecture & Core Reliability

Quizlet•San Francisco, CA

74d•$260,000 - $320,000

About The Position

We're hiring a Principal Engineer to lead critical architectural decisions that establish industry-leading standards for reliability and operational excellence. This is a high-leverage, hands-on role focused on optimizing performance, driving engineering velocity, and leading systemic architectural change. The role reports to the Senior Director of Technical Infrastructure.

This is an onsite position in our San Francisco office. To help foster team collaboration, we require that employees be in the office a minimum of three days per week (Monday, Wednesday, and Thursday), and as needed by your manager or the company. We believe that this working environment increases work efficiency, strengthens team partnership, and supports growth as an employee and organization.

Requirements

Deep technical mastery and hands-on experience across three or more of the following high-leverage domains, with 10+ years of experience in software development, site reliability engineering, or platform engineering.
A proven track record of driving significant architectural outcomes as a Principal or Staff Engineer in a high-scale platform or infrastructure role.
A history of scaling consumer-facing systems that reliably handle tens of thousands of requests per second (RPS) and successfully achieving high-availability targets across multi-region cloud environments.
Deep expertise in architecting and optimizing complex data backbones involving transactional, globally consistent systems and analytical systems.
Strong operational knowledge of Kubernetes (GKE) orchestration layered with a Service Mesh (Istio).
Proven ability to design and implement automated CI/CD pipelines leveraging tools like GitHub Actions, CircleCI, and ArgoCD.
Mastery of comprehensive monitoring with Datadog, including defining SLOs and performing deep-dive investigations.
Extensive experience in designing and optimizing large-scale cloud-native architecture on GCP (or equivalent cloud providers).

Responsibilities

Lead the strategy and implementation necessary to achieve and maintain our 99.95% availability target.
Define the architectural approach for scaling our core data systems, optimizing performance across Cloud Spanner, PlanetScale MySQL, and BigQuery.
Drive performance and efficiency improvements across our managed compute environment, specifically optimizing Kubernetes (GKE) clusters and managing the performance and operational complexity of Istio.
Architect high-leverage internal platforms, designing the pipelines across tools like GitHub Actions, CircleCI, and ArgoCD.
Drive reliability change across the engineering organization by leveraging deep-dive analysis of incidents (Jeli) and proactive monitoring (Datadog).
Act as a technical owner for the cost-per-request metric, identifying and implementing architectural efficiencies.

Benefits

Total compensation for this role is market competitive, including a starting base salary of $260,000 - $320,000, depending on location and experience, as well as company stock options.
20 vacation days that we expect you to take!
Competitive health, dental, and vision insurance (100% employee and 75% dependent PPO, Dental, VSP Choice).
Employer-sponsored 401k plan with company match.
Access to LinkedIn Learning and other resources to support professional growth.
Paid Family Leave, FSA, HSA, Commuter benefits, and Wellness benefits.
40 hours of annual paid time off to participate in volunteer programs of choice.

What This Job Offers

Job Type

Full-time

Career Level

Senior

Industry

Publishing Industries
