Site Reliability Engineer

Accenture Federal Services, Arlington, VA

About The Position

At Accenture Federal Services, our purpose is to make the US federal government stronger and safer, and to improve the lives of citizens through technology and ingenuity. We are a community of over 13,000 professionals dedicated to serving clients across defense, national security, public safety, civilian, and military health organizations. As a technology company within the global Accenture network, we have been recognized as one of Glassdoor's Top 100 Best Places to Work. We foster a collaborative and supportive environment where employees are empowered to grow, learn, and thrive through hands-on experience, certifications, and industry training. Join us to drive meaningful, lasting change that advances missions and propels the government forward.

Requirements

  • Bachelor’s degree (or 4 years of additional equivalent experience)
  • Minimum 8 years of experience managing reliability and uptime and automating operations for large IT systems
  • Must meet DoD 8140 requirements
  • Active TS/SCI clearance

Responsibilities

  • Ensure the reliability, performance, and scalability of the Client System.
  • Define and track Key Performance Indicators and Service Level Objectives.
  • Identify and resolve performance bottlenecks.
  • Perform root cause analysis on incidents to implement preventative measures and enhance system efficiency.
  • Design and implement monitoring and alerting systems to provide visibility into system health and performance.
  • Develop and maintain runbooks and playbooks for operational procedures.
  • Automate routine operational tasks to improve efficiency and reduce human error.
  • Conduct capacity planning to ensure the system can handle expected loads.
  • Implement load testing to validate system performance under stress.
  • Establish disaster recovery procedures to minimize downtime.
  • Participate in on-call rotations to respond to system incidents.
  • Collaborate with development teams to improve application reliability and performance.
  • Implement chaos engineering practices to identify weaknesses before they impact users.

Benefits

  • Health insurance
  • Dental insurance
  • Vision insurance
  • Life insurance
  • Disability insurance
  • 401(k)
  • Professional development
  • Learning and development program
  • Continued education
  • Tuition reimbursement
  • Employee discount programs
  • Wellness programs
  • Flexible scheduling