Senior Site Reliability Engineer (US Federal)

Workday, Reston, VA
Hybrid

About The Position

This role will support one or more direct or indirect contracts with the U.S. Federal Government which, due to federal government security requirements, mandate that all Workday personnel working on the contracts be United States citizens (naturalized or native).

We are looking for a highly motivated Site Reliability Engineer to join our growing Infrastructure and Platform Engineering team. You will play a critical role in operating, monitoring, automating, and maintaining our Kubernetes-based platform, providing metrics and observability for it, and enabling our engineering organization to deliver capabilities and incremental products quickly and reliably. You and the team will have the opportunity to work with cutting-edge technologies, solve complicated problems, and contribute to the foundation of our infrastructure. This role requires a strong technical background, a passion for automation, and a collaborative mindset. You have a growth mindset and will be part of a team that promotes a diverse and inclusive environment, where you and your workmates are happy, energized, engaged, and excited to come to work every day.

Requirements

  • A minimum of 5 years of hands-on experience working with large-scale cloud infrastructure, automation, and overall DevOps methodologies.
  • Bachelor's degree in a computer-related field or equivalent work experience.
  • This role may require a security clearance at the TS/SCI w/CI Poly level. Applicants must have the ability to obtain and maintain a U.S. government-issued security clearance.

Nice To Haves

  • Infrastructure as code: Proficiency in infrastructure automation tools like Terraform.
  • CI/CD: Experience with building, maintaining, and consuming CI/CD pipelines and tools like Argo CD.
  • Problem-solving: Strong analytical and problem-solving skills.
  • Communication: Excellent communication and collaboration skills.
  • Strong understanding of Kubernetes.
  • Amazon Web Services proficiency working in a production environment.
  • Proficiency in at least one programming language such as C#, Python, Ruby, Rust, or Go.
  • Experience with security auditing and compliance frameworks.
  • Experience working in air-gapped cloud regions.
  • An active TS/SCI w/CI Poly is preferred.

Responsibilities

  • Ensure the Workday Kubernetes-based platform is well maintained, healthy, and highly available for our customers through infrastructure automation, CI/CD pipelines, reporting, incident handling and response, and observability tools.
  • Maintain the overall platform: Maintain core platform components, ensuring high availability, scalability, and security.
  • Automate and optimize: Automate infrastructure provisioning, configuration management, and application deployments using tools like Terraform and Argo CD.
  • Troubleshooting and support: Provide support for platform-related issues, working closely with development teams to resolve problems.
  • Security and compliance: Implement and maintain security standard methodologies for the platform, ensuring compliance with industry standards.
  • Documentation and knowledge sharing: Build and maintain comprehensive documentation for platform components and processes. Actively participate in knowledge sharing within the team.
  • Collaborate effectively with other engineers and development teams across multiple locations and time zones.
  • Stay up to date with the latest technologies and trends in the platform engineering space.