Site Reliability Engineer (SRE) - II

Huntington Bancshares, Columbus, OH
Hybrid (posted 79d ago)

About The Position

Are you a natural problem solver who thrives in high-pressure situations and enjoys working across teams to keep systems running smoothly? We're looking for a Site Reliability Engineer (SRE) Level II who brings not only technical expertise but also strong communication and collaboration skills to help support and scale our critical systems. This role is ideal for someone who is personable and proactive, with a knack for jumping into complex issues and guiding troubleshooting conversations. You will be part of a team that ensures our systems are resilient, scalable, and well-supported. While technical depth is important, we're especially looking for someone who can lead incident response, communicate clearly, and drive continuous improvement in both systems and processes.

Requirements

  • Bachelor's degree in Computer Science or Information Technology.
  • 3+ years of experience in site reliability engineering, DevOps, systems administration, or related roles.

Nice To Haves

  • Strong troubleshooting and communication skills in production environments.
  • Experience supporting applications in both .NET and Spring Boot frameworks.
  • Familiarity with OpenShift, Windows Server, and hybrid deployment environments.
  • Proficiency in log analysis using SQL queries and Splunk.
  • Strong scripting skills (e.g., PowerShell, Bash, Python).
  • Familiarity with cloud platforms (AWS, GCP, etc.).
  • Hands-on experience with observability tools (Dynatrace, Datadog, etc.).
  • Strong interpersonal skills and a customer-focused mindset.

Responsibilities

  • Lead real-time troubleshooting efforts for high-impact production issues.
  • Collaborate across IT and engineering teams to resolve incidents quickly and effectively.
  • Provide mentorship and guidance to junior SREs and support staff.
  • Participate in on-call rotations and act as an escalation point.
  • Build and maintain automation using tools like Terraform, Ansible, or CloudFormation.
  • Eliminate manual tasks and improve reliability through scripting and automation.
  • Build and optimize monitoring dashboards using tools such as Prometheus, Dynatrace, and Splunk.
  • Ensure visibility into system health and proactively detect issues.
  • Drive improvements in deployment, monitoring, and incident response processes.
  • Champion best practices across the SRE and support teams.

What This Job Offers

Industry

Credit Intermediation and Related Activities

Education Level

Bachelor's degree

Number of Employees

5,001-10,000 employees
