Principal Engineer II

Menlo Security

About The Position

Menlo Security's mission is enabling the world to connect, communicate, and collaborate securely without compromise. COVID-19 has made our mission all the more real. We support customers across various enterprises, including Fortune 500 companies, 9 of the 10 largest global banks, and the Department of Defense.

The world has fundamentally changed. We are growing from 400 employees into the next phase of our journey, and we need passionate talent filled with empathy and agility. The right candidate for the job is ethical, hyper-organized, fanatical about seeing things through to completion, service-oriented, and humble enough to take feedback and coaching yet confident enough to provide feedback and coaching.

Menlo is well-funded for growth and our investors are second to none. They include Vista Equity Partners (“Vista”), General Catalyst, JPMC, American Express, HSBC, and Ericsson Ventures.

About the Role

Platform Infrastructure Engineering is responsible for building and operating Menlo Security's Infrastructure Platform. Together with the rest of our engineering teams, we enable our customers to connect to the Internet without compromise. Our environment provides services globally. We expect failure, build security in by design, create evolvable systems, and enable multi-tenancy across the infrastructure. Automation and thoughtful use of Gemini and Claude AI tooling to accelerate our workflows are an absolute must for us. We are committed to getting it done properly, the first time.

As a Principal II Platform Infrastructure Engineer, you'll join a group of experienced engineers who are part of a globally distributed team responsible for building and managing the company's core infrastructure services and maintaining our constantly growing platform. The team operates a sophisticated cloud-native infrastructure built on Google Kubernetes Engine and VMs, spanning multiple environments globally from development to production.
Operating at the highest level of individual contribution, you will drive the technical vision for this environment. Crucially, you will draw on your expertise to guide the organization through complex architectural transformations, strategically decoupling legacy monolithic systems into scalable, highly resilient cloud-native microservices.

Requirements

Bachelor's degree in Computer Science, similar technical field of study, or equivalent practical experience, coupled with 15+ years of progressive infrastructure engineering experience.
Extensive, proven experience successfully guiding engineering organizations through large-scale architectural transformations from legacy monoliths to microservice-based ecosystems.
Proficiency in common programming and scripting languages. We use a lot of Python, Bash, and Go.
Kubernetes expertise including cluster administration, RBAC, networking, workload management, and troubleshooting across production environments. Knowledge of Google Cloud Platform services including GKE, VPC networking, Cloud DNS, Artifact Registry, Secret Manager, IAM, Gemini Code Assist, and Workload Identity.
Proven experience with Terraform for infrastructure provisioning and management.
Understanding of network topologies, communication protocols (e.g., TCP/IP, HTTP/S, UDP, TLS), and enterprise-grade connectivity solutions.
Experience with GitOps methodologies and tools. Clear understanding of how to use LLM code assist tools to effectively build software.

Responsibilities

Define the long-term architectural roadmap and design, deploy, and maintain VM and Kubernetes infrastructure on GCP and AWS across dozens of clusters spanning development, staging, and production environments in multiple regions.
Lead the strategic modernization of our services, acting as the primary architectural guide for development teams navigating the complex transition from monolithic architectures to decoupled microservices.
Build and maintain Infrastructure as Code (IaC) using Terraform modules, managing resources through Spacelift or equivalent Terraform Automation and Collaboration Software (TACOS). Provision cloud infrastructure including networking, compute, storage, and security components primarily on GCP, with secondary AWS support. Implement and manage workflows with sophisticated multi-layer configuration management.
Partner with Engineering, Product, Compliance, and Security teams to design resilient, scalable systems. Consult on capacity planning, disaster recovery, and architectural decisions for cloud-native applications.
Build and maintain comprehensive observability solutions using Grafana Cloud, Prometheus/Mimir, and OTel collectors. Design Grafana dashboards, configure alerting rules, and ensure visibility across all platform components.
Manage certificate lifecycle, DNS automation, ingress controllers, and service mesh networking with Cilium.
Identify and eliminate toil through automation and the use of modern AI tools like Gemini and Claude. Write scripts, develop tools, and build CI/CD pipelines to improve operational efficiency and reduce manual work.
Participate in a 24x7 on-call rotation as part of a globally distributed team, responding to incidents and driving post-incident reviews.

Benefits

Base Salary is one part of our competitive total compensation and benefits package
salary range
stock-based compensation grants
equal opportunity employer
