About The Position

As a Site Reliability and Operations Engineer (SRE), you’ll be part of the action, working closely with application teams to automate operations, optimize infrastructure, and solve issues in an exciting, fast-paced environment. You’ll play a vital role in ensuring that our systems are reliable, scalable, and high-performing.

This role is designed for driven individuals who:

  • Love learning new technologies and thrive on solving sophisticated challenges
  • Are independent, motivated, and excited to take on ambitious projects
  • Excel at collaborating with engineering teams and can stay calm under pressure
  • Have a passion for delivering quality, reliable solutions in a dynamic, high-energy workplace

We are seeking dedicated Site Reliability Engineers (SREs) at all levels of experience, from junior to senior, to join our teams.

Requirements

  • 3+ years of experience in Site Reliability Engineering, DevOps, Software Engineering, or a related field
  • Strong foundation in a programming language (Java) or a scripting language (Python / Bash / Lua)
  • Hands-on experience with one or more databases (relational or NoSQL, such as Oracle or MongoDB)
  • Bachelor’s or Master’s degree in Computer Science or a related field (or equivalent practical experience)

Nice To Haves

  • Hands-on experience with monitoring and logging tools (e.g., Prometheus, Splunk, Grafana, CloudWatch)
  • Proficiency in Linux and networking concepts (TLS/SSL, DNS, load balancers, etc.), with troubleshooting skills in large-scale environments
  • Experience with source control management such as Git, and an understanding of CI/CD, release engineering, and DevOps
  • Understanding of security standards, policies, and cryptography
  • Experience with Incident / Problem management and RCA
  • Strong networking and load balancing experience (Nginx, Envoy, NetScaler) is a huge plus
  • Solid understanding of Kubernetes concepts such as networking, storage, Secrets, Deployments, and containers; AWS or GCP experience is preferred
  • Knowledge of or experience in governance and compliance
  • Understanding of SRE principles, including observability, error budgeting, service reliability measurement through SLAs, SLOs, and SLIs, corresponding telemetry standards and practices, and product feedback
  • Strong analytical skills