About The Position

Ampcus Inc. is a certified global provider of a broad range of Technology and Business consulting services. We are in search of a highly motivated candidate to join our talented Team.

Job Title: Site Reliability Engineer (SRE)
Location(s): Fort Mill, SC

Role Overview

The role requires a strong Site Reliability Engineering (SRE) and DevOps professional with deep expertise in observability, automation, cloud platforms, and modern operational practices. The individual will provide oversight of production operations, drive reliability and scalability, and enable proactive, data-driven operations through automation, AIOps, and continuous improvement.

Requirements

  • Strong developer background with the ability to understand application-layer behavior and its interaction with infrastructure platforms.
  • End-to-end understanding of the software delivery lifecycle—from code management through deployment.
  • Quick learner with the ability to rapidly adopt new tools, scripting languages, or technologies as required.
  • Proactive, solution-oriented mindset with a strong focus on reliability, scalability, and operational excellence.

Responsibilities

  • Design, build, and maintain enterprise-grade dashboards for monitoring, observability, and operational insights.
  • Implement intelligent alerting systems across multiple platforms to proactively identify and mitigate issues.
  • Deliver full-stack observability solutions, including monitoring, logging, tracing, and event management integrations.
  • Provide oversight of production operations to maximize service reliability, resiliency, and automation.
  • Define, implement, and continuously evolve SRE practices, procedures, tooling, and runbooks.
  • Monitor system capacity, performance, and health trends; provide analytics, forecasting, and capacity recommendations.
  • Drive a proactive operational model, focusing on prevention and optimization rather than reactive incident response.
  • Design, develop, and roll out CI/CD frameworks across hybrid and multi-cloud environments.
  • Implement Infrastructure as Code (IaC) solutions using Terraform and cloud-native tooling.
  • Facilitate release and deployment management across multiple non-production and production environments.
  • Build, deploy, and manage DevOps pipelines on AWS, Azure, and GCP.
  • Provide day-to-day technical direction and innovation for platform services, with a strong focus on Azure.
  • Enable core platform capabilities including cloud connectivity, infrastructure, and d services at scale.
  • Implement AIOps and data-driven operational tooling and dashboards to improve decision-making and operational efficiency.
  • Identify opportunities to automate repetitive or manual processes; champion automation-first thinking.
  • Identify inefficiencies within Platform Services Operations and lead continuous improvement initiatives.
  • Define, document, and maintain standard operating procedures, runbooks, and architectural documentation.
  • Create and maintain system architecture diagrams and operational documentation using Jira, Confluence, and UML.
  • Translate discussions from troubleshooting, design sessions, and brainstorming meetings into clear architecture diagrams and actionable plans.
  • Ensure operational processes are executed with high attention to detail, speed, and on-time delivery.
  • Think outside the box, continuously challenging traditional processes and driving innovation through automation.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Entry Level
  • Education Level: No Education Listed
  • Number of Employees: 1,001-5,000 employees
