Google Cloud Platform - SRE Manager

Huntington National Bank - posted 2 days ago

Full-time • Mid Level

Hybrid • Columbus, OH

The Google Cloud Platform (GCP) Site Reliability Engineer (SRE) Manager is responsible for supporting the GCP framework and consumers of the platform. The position reports to the Chief Development Office’s (CDO) Cloud Infrastructure Acceleration team. The SRE Manager will lead a team of onshore and offshore SREs to develop Infrastructure as Code (IaC) and pipelines that provide platform, infrastructure, observability, and security capabilities. The qualified candidate will collaborate with the CDO, Application, Incident, Security, and Change Management teams to manage the ITIL process, reduce toil, enhance reliability, and drive innovation. The candidate will manage a team of developers whose goals are reliability, compliance, automation, enablement, release when ready, and building a culture of support, continuous improvement, and learning.

Manage the GCP SRE team and discipline, maintain service levels, manage cost, and enhance operations.
Manage the Stack Overflow channel, GCP releases, and Disaster Recovery exercises.
Manage Platform RBAC, Firewall and User Access certifications.
Support GCP’s third-party system integrations.
Develop SRE strategies, best practices, and knowledge base.
Develop monitoring and alerting capabilities to increase observability and availability and to reduce toil.
Participate in the DevSecOps model to build, assess, and implement SRE cloud solutions via IaC.
Collaborate with Incident, Cybersecurity, Application and SRE teams to troubleshoot issues, restore functionality, perform root cause analysis, and deliver enhancements.
Provide 24x7 GCP support and coordinate on-call rotations.
Conduct periodic blameless incident retrospectives and focus on continuous improvement.
Conduct training sessions and simulated game days.
Experience with scripting and programming languages and concepts
Demonstrate knowledge of GCP, CLI, services and integration.
Demonstrate knowledge of DevSecOps tool chains and processes.
Demonstrate knowledge of IaC software: Terraform, CLI, CDM, CFT, and ARM.
Demonstrate knowledge of Security as Code principles, policy, best practices, and tools.
Demonstrate knowledge of Credential, Certificate and Encryption best practices, rotation, and policies.
Experience using monitoring tools like Cloud Logging, Splunk, and Dynatrace to evaluate system health, develop dashboards, research issues, identify root causes and provide solution options.
Duties as assigned

Minimum of 5 years of SRE experience with GCP, AWS, and/or Azure
Minimum of 5 years of experience developing automated solutions using IaC (Terraform or OpenTofu). Additional experience with Python, PowerShell, Ansible, Chef, Ruby, and JSON is a plus.
Minimum of 3 years managing onshore or offshore teams.
Bachelor's degree or equivalent work experience

Experience troubleshooting cloud-based technologies.
Cloud (GCP, AWS, Azure) and/or IaC certifications and/or work experience
Experience in Agile delivery, Azure DevOps Services, CI/CD Pipelines, Monitoring and Security tools.
Security tool integration experience: Prisma, Snyk, or Gitleaks.
Experience with cloud security, IAM, Security Scans and custom policies.
Full stack engineering knowledge – application, network, infrastructure, and security
Understanding of containers and serverless computing concepts
Background in application, database, and infrastructure monitoring tools
Willingness to guide others and outstanding communication skills
Familiarity with the financial industry

health insurance coverage
wellness program
life and disability insurance
retirement savings plan
paid leave programs
paid holidays
paid time off (PTO)
