Sr. Staff Cloud Platform Engineer - Security

UKG
Atlanta, GA
$129,500 - $186,100

About The Position

At UKG, our purpose is people. We are seeking a highly technical Cloud Resilience Analyst to join our Resilience team within the Cyber Security organization. This is not a traditional Business Continuity Planning (BCP) or Disaster Recovery (DR) compliance role. We are looking for a true cloud practitioner: someone who has spent years hands-on in the trenches building highly available architectures and is now ready to step into a strategic advisory role. In this position, you will act as a primary consultant to multiple product engineering teams and enterprise groups across UKG. You will guide them in designing, implementing, and validating fast-failover, highly redundant solutions for our SaaS applications. If you have a passion for eliminating single points of failure and designing self-healing infrastructure on modern cloud platforms, this role is for you.

Requirements

  • Deep, practical technical knowledge of Google Cloud Platform (GCP) core services, specifically GKE, Compute Engine, and CloudSQL.
  • Familiarity with AWS and Azure is highly desirable.
  • Proven experience as a hands-on engineer who has deployed complex infrastructure. You should understand the implementation details well enough to effectively guide engineering teams.
  • Demonstrated success in architecting active-active or active-passive fast failover mechanisms for high-volume, data-intensive SaaS applications.
  • Strong understanding of database clustering, replication, and migration strategies (especially migrating legacy RDBMS like MS SQL Server to cloud-native solutions like CloudSQL).
  • Excellent communication and consulting skills, with the ability to influence technical teams, explain complex architectural concepts, and foster a culture of resilience without having direct reporting authority over the engineering teams.

Nice To Haves

  • Practical experience designing and managing high-availability network topologies, including load balancing, DNS & name resolution, firewalls/gateways, identity/authentication systems, and centralized logging/SIEM.
  • Deep understanding of user session replication, session state persistence, and failover routing strategies in high-traffic, multi-region application architectures.
  • Familiarity with assessing or designing resilience across a comprehensive range of critical SaaS failure domains, such as API gateways, caching layers, messaging/queuing systems, and CI/CD pipelines.

Responsibilities

  • Act as the primary resilience advisor to multiple distributed product and enterprise teams, guiding them on best practices for building high availability (HA) and redundancy into their SaaS applications.
  • Design and recommend fast-failover solutions and highly available infrastructure primarily on Google Cloud Platform (GCP), while also providing oversight for workloads in Azure and AWS.
  • Leverage your strong background in Infrastructure as Code (IaC) to review, validate, and guide the implementation efforts of engineering teams.
  • Design redundancy strategies for workloads running on Google Kubernetes Engine (GKE) and virtual machines, ensuring self-healing deployments.
  • Partner closely with DevOps, SRE, and Product Engineering teams to champion resilience engineering principles, chaos testing, and failover validations across tier-0 mission-critical systems.

Benefits

  • flexibility that’s real
  • benefits you can count on
  • a team that succeeds together
  • performance-based bonus plan
  • restricted stock unit awards
  • health insurance
  • dental insurance
  • vision insurance