Senior SRE I

Waystar
Louisville, KY
Hybrid

About The Position

Operates the company's complex, high-traffic, business-critical internet site communications and/or network-based (cloud) product systems. Plans, designs, and implements scalable local and wide-area network solutions between multiple platforms and protocols (including IP and VoIP). Responsible for system performance; supports and troubleshoots network issues and coordinates installation of equipment such as routers and switches with appropriate vendors. Develops tools to automate the deployment, administration, and monitoring of network systems. Provides training and assists with proposal writing. Conducts project planning, cost analysis, and vendor comparisons, and works on project implementation. Works with development teams to improve system operability. Conducts tests of network redundancy, resilience, and failover of network elements to ensure uptime standards are met. May be required to provide on-call service coverage with other department employees.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
  • 5+ years of experience in a Site Reliability Engineering, DevOps, or highly related infrastructure engineering role.
  • Strong proficiency in at least one scripting/programming language (e.g., Python, Go, Java, Ruby, Bash).
  • Extensive experience with cloud platforms (AWS, Azure, GCP) including services related to compute, networking, storage, and databases.
  • Deep understanding of Linux operating systems and networking fundamentals.
  • Proven experience with infrastructure as code tools (e.g., Terraform, CloudFormation, Ansible).
  • Solid experience with CI/CD pipelines and related tools (e.g., Jenkins, GitLab CI, GitHub Actions).
  • Demonstrable expertise in monitoring and alerting systems (e.g., Prometheus, Grafana, Datadog, Splunk).
  • Strong problem-solving skills with a methodical approach to debugging complex distributed systems.
  • Excellent communication and collaboration skills, with the ability to work effectively across cross-functional teams.

Nice To Haves

  • Experience with containerization technologies (Docker, Kubernetes) is highly desirable.
  • Familiarity with database technologies (relational and NoSQL) and their operational challenges.

Responsibilities

  • Design, implement, and maintain automation for infrastructure provisioning, configuration management, and application deployments across various environments (on-premise and cloud).
  • Proactively monitor system health, performance, and availability, utilizing a range of observability tools and defining key performance indicators (KPIs) and service level objectives (SLOs).
  • Lead the investigation and resolution of complex production incidents, perform root cause analysis, and implement preventative measures to minimize future occurrences.
  • Collaborate with development teams to ensure software is designed for reliability, scalability, and operational efficiency, participating in architectural reviews and providing expert guidance.
  • Develop and maintain robust incident response procedures, runbooks, and disaster recovery plans.
  • Contribute to the evolution of our SRE practices, tooling, and standards, driving continuous improvement and knowledge sharing within the team.
  • Participate in an on-call rotation to provide 24/7 support for critical production systems.
  • Mentor junior SREs and contribute to the growth and development of the team.
  • Evaluate and implement new technologies and solutions to enhance system reliability and operational efficiency.

Benefits

  • Competitive total rewards (base salary + bonus, if applicable)
  • Customizable benefits package (3 medical plans with Health Savings Account company match)
  • Generous paid time off for non-exempt team members, starting with 3 weeks + 13 paid holidays, including 2 personal floating holidays.
  • Flexible time off for exempt team members + 13 paid holidays
  • Paid parental leave (including maternity + paternity leave)
  • Education assistance opportunities and free LinkedIn Learning access
  • Free mental health and family planning programs, including adoption assistance and fertility support
  • 401(k) program with company match
  • Pet insurance
  • Employee resource groups