Site Reliability Engineer

TEKsystems
Chandler, AZ
$70 - $70
Hybrid

About The Position

We are seeking a Senior Site Reliability Engineer (Sr SRE) to provide production support and reliability engineering for multiple enterprise‑scale technology platforms. This role focuses on ensuring the availability, performance, resiliency, and operational stability of systems supporting enterprise security, automation, orchestration, CI/CD pipelines, and cloud platforms. The ideal candidate has deep experience supporting complex, distributed enterprise solutions, excels in incident response and troubleshooting, and collaborates effectively with engineering and product stakeholders to improve system reliability and operational maturity.

Requirements

  • 5–10 years of experience supporting enterprise‑scale production systems.
  • Strong experience supporting multiple enterprise solutions across one or more of the following domains:
      • Enterprise security platforms
      • Automation and orchestration tools
      • Workflow automation
      • CI/CD pipelines
      • Cloud platforms
  • Hands‑on experience with Linux and Windows system administration.
  • Experience with automation and infrastructure‑as‑code tools, such as Ansible / Ansible Tower and Terraform.
  • Proficiency in one or more programming or scripting languages, including Python, Java, and .NET.
  • Experience using enterprise monitoring, logging, and observability tools, such as Dynatrace, Splunk, Tivoli ITM, and SiteScope.
  • Experience supporting IT service management (ITSM) processes using tools such as ServiceNow or BMC Remedy.
  • Strong analytical, troubleshooting, and documentation skills.
  • Ability to operate effectively in a fast‑paced, highly regulated enterprise environment.

Nice To Haves

  • Experience supporting enterprise security, endpoint, and configuration management platforms.
  • Familiarity with CI/CD tooling such as artifact repositories and pipeline security scanning.
  • Experience supporting hybrid or cloud‑native environments.
  • Exposure to SRE best practices, including:
      • SLIs, SLOs, and error budgets
      • Observability and reliability metrics
  • Experience developing dashboards or operational reporting using analytics or visualization tools (e.g., Tableau).
  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).

Responsibilities

  • Provide senior‑level production support for multiple enterprise platforms, including participation in on‑call rotations.
  • Monitor system health, availability, and performance to ensure compliance with operational and reliability standards.
  • Serve as an escalation point for complex incidents, performing detailed troubleshooting and root‑cause analysis.
  • Lead or contribute to problem management efforts, including post‑incident reviews and implementation of preventive actions.
  • Collaborate with engineering, product, and operations stakeholders to influence designs that improve reliability, scalability, and supportability.
  • Support and operate enterprise platforms across areas such as:
      • Security and endpoint management
      • Infrastructure automation and configuration management
      • Workflow orchestration
      • CI/CD pipelines and artifact repositories
      • Cloud and hybrid environments
  • Implement automation and scripted solutions to reduce manual effort and improve operational efficiency.
  • Utilize monitoring, logging, and application performance management tools to proactively identify and resolve issues.
  • Manage incidents, changes, and service requests using enterprise IT service management tools.
  • Create and maintain operational documentation, runbooks, dashboards, and standard operating procedures.
  • Support platform upgrades, maintenance activities, and capacity/performance planning initiatives.

Benefits

  • Medical, dental & vision
  • Critical Illness, Accident, and Hospital
  • 401(k) Retirement Plan – Pre-tax and Roth post-tax contributions available
  • Life Insurance (Voluntary Life & AD&D for the employee and dependents)
  • Short and long-term disability
  • Health Savings Account (HSA)
  • Transportation benefits
  • Employee Assistance Program
  • Time Off/Leave (PTO, Vacation or Sick Leave)