Senior Site Reliability Engineer (SRE) Senior Manager

Accenture Federal Services, Washington, DC

About The Position

At Accenture Federal Services, nothing matters more than helping the US federal government make the nation stronger and safer and life better for people. Our 13,000+ people are united in a shared purpose to pursue the limitless potential of technology and ingenuity for clients across defense, national security, public safety, civilian, and military health organizations. Join Accenture Federal Services, a technology company and part of global Accenture, to do work that matters in a collaborative and caring community, where you feel like you belong and are empowered to grow, learn and thrive through hands-on experience, certifications, industry training and more. Join us to drive positive, lasting change that moves missions and the government forward!

We are seeking a Senior Site Reliability Engineer (SRE) with deep expertise in building and maintaining reliable, scalable systems and a passion for optimizing the performance, reliability, and efficiency of technical infrastructure. The ideal candidate will have a strong background in site reliability engineering principles, extensive experience with automation, and a proven ability to collaborate across teams to ensure seamless service delivery.

Requirements

  • Proven experience in site reliability engineering or a similar role, with a focus on application and infrastructure scalability, reliability, and performance.
  • Strong knowledge of ITSM principles and incident management processes.
  • Expertise in automation tools, scripting, and infrastructure-as-code (IaC) technologies.
  • Proficiency with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, Splunk).
  • Experience with cloud platforms (e.g., AWS, Azure, GCP) and container technologies (e.g., Docker, Kubernetes).
  • Strong analytical and problem-solving skills, with the ability to troubleshoot complex systems.
  • Excellent communication and collaboration abilities, with a focus on cross-team partnerships.
  • A passion for continuous learning, innovation, and driving improvements in reliability and efficiency.
  • US citizenship is required.
  • Ability to obtain and maintain a government security clearance.

Nice To Haves

  • Advanced Degree
  • 15+ years of industry experience
  • Motivated and proactive, with a desire to understand and address complex areas
  • Curiosity about new technology, industry best practices, and areas of risk, with the ability to analyze new insights and turn them into concrete action
  • Commitment to delivering tangible outcomes for customers and stakeholders
  • Strong written and verbal communication/interpersonal skills to effectively collaborate with cross-functional teams and stakeholders.
  • Excellent people management and relationship development skills
  • In-depth knowledge of Accenture delivery methodologies and practices

Responsibilities

  • Design, build, and maintain reliable, scalable, and high-performance infrastructure and services to support business needs.
  • Implement and advocate for SRE best practices, including automation, CI/CD pipelines, monitoring, and incident management.
  • Collaborate with cross-functional teams to develop systems that meet high availability, performance, and reliability standards.
  • Drive incident management processes, including root cause analysis, mitigation strategies, and long-term preventive measures.
  • Establish, monitor, and refine service level objectives (SLOs), service level agreements (SLAs), and key performance indicators (KPIs) to ensure systems adhere to reliability and performance targets.
  • Automate repetitive tasks to improve operational efficiency and reduce manual intervention.
  • Build and maintain robust monitoring, logging, and alerting systems to ensure visibility into system performance and reliability.
  • Provide technical mentorship and guidance to team members, fostering a culture of knowledge sharing and continuous improvement.
  • Act as a technical leader by driving solutions to complex challenges, ensuring alignment with organizational goals.
  • Prepare and deliver performance and reliability reports to stakeholders, offering insights and recommendations for improvements.