The Site Reliability Engineer (SRE) role is responsible for the optimization and reliability of core technical platforms and platform services, and for exercising significant technical leadership in the continuous improvement of service reliability for platform stakeholders. The SRE will champion the overall health of OF core technical platforms, lead the response to operational incidents, determine root causes, and propose and implement remediations that ensure overall platform viability.

OF IT platforms and infrastructure span three on-premises locations: Office Headquarters (Reston, VA), the Primary Data Center Co-Location (Sterling, VA), and the Disaster Recovery Data Center Co-Location (Chicago, IL), as well as a limited set of infrastructure services provided by Microsoft Azure (Azure). The core technical platform is Red Hat OpenShift, with a variety of platform services, including but not limited to Red Hat AMQ, HashiCorp Vault, and Keycloak, that are consumed by various platform stakeholders. This role will span from the OpenShift platform to services provided by Azure.

We're proud of the way our teammates have a positive impact on everything we do. Our employees are committed to and exemplify our Core Values:
Integrity through accountability, consistency, transparency, and trust
Agility through adaptability, continuous improvement, expertise, and flexibility
Partnership through collaboration, communication, leadership, and teamwork
Inclusivity through diversity, relationships, respect, and support
Job Type
Full-time
Career Level
Mid Level
Number of Employees
101-250 employees