Peraton is seeking a Cloud Reliability & Support Engineer in our Chantilly, VA office to support our Department of Defense (DoD) customer as part of a talented, motivated, high-performing team. As the program’s expert in Level 3 Anomaly Resolution and operational excellence, you will apply deep expertise in RHEL and RHOSP internals to in-project troubleshooting, ensuring tenant applications fully utilize the cloud’s resiliency features. Your focus is stability: identifying the root causes of system anomalies within the tenant's provisioned environment. Join us and be part of the next generation of innovators as we blaze a trail forward for our profession and company.

What you'll do:

Anomaly Resolution & Deep Troubleshooting

Serve as the primary technical resource for complex, escalated incidents contained within the tenant's RHOSP project and resources.

RHEL/OS Deep Dive: Troubleshoot issues on tenant RHEL instances, including kernel panics, package conflicts, file system errors, and performance degradation (CPU, memory, I/O).

RHOSP Resource Triage: Diagnose issues related to the tenant's consumption of OpenStack services (e.g., Nova instance failures, Neutron port issues, Cinder volume attachment problems).

Use monitoring tools to perform deep-dive analysis and isolate the root cause of service disruptions within the OpenStack data plane.

Root Cause Analysis (RCA): Own the technical execution and documentation of RCAs, focusing on issues rooted in RHEL/RHOSP misconfiguration or resource limitations.

Maintain a partnership with the Red Hat vendor to stay current with the latest advancements in Red Hat products and industry best practices, keeping the infrastructure effective and innovative.
Job Type
Full-time
Career Level
Mid Level
Education Level
Associate degree
Number of Employees
5,001-10,000 employees