Site Reliability Engineer (Remote)

Libertex Group
Remote

About The Position

We're looking for a Site Reliability Engineer to support and optimize large-scale distributed systems, ensuring high availability, performance, and reliability across production environments. You will monitor system health, troubleshoot complex issues, and drive improvements through automation, observability, and site reliability engineering best practices.

Requirements

  • Strong SQL skills (T-SQL preferred), including query optimization, performance tuning, and data integrity management.
  • Hands-on experience with Microsoft SQL Server, database design, migrations, and partitioning strategies.
  • Experience with monitoring and observability tools such as Prometheus, Grafana, and ELK.
  • Familiarity with cloud platforms (AWS, GCP, Azure).
  • Proficiency in Python and scripting (Bash/PowerShell) for automation, ETL processes, data manipulation, and API integrations.
  • Basic understanding of networking concepts and protocols (HTTP, DNS, CDN).
  • Experience with Apache Airflow, Docker, Kubernetes, Ansible/IaC, and CI/CD tools (GitLab, Jenkins).
  • Strong communication and collaboration skills, with a proactive, problem-solving mindset.
  • English level: Intermediate (B1) or higher.

Responsibilities

  • Identify, analyze, and resolve issues in production and non-production systems.
  • Participate in incident response, root cause analysis, and follow-up actions.
  • Take part in an on-call rotation and support production incidents when needed, including outside regular working hours.
  • Help develop and improve the observability system.
  • Collect and analyze metrics from operating systems, infrastructure, and applications.
  • Use monitoring data to support performance tuning, fault finding, and capacity planning.
  • Implement, maintain, and improve CI/CD processes.
  • Create sustainable systems and services through automation and continuous improvement.
  • Reduce manual work and improve operational efficiency.
  • Partner with development teams to improve service reliability, testing, deployment, and release processes.
  • Support platform stability, scalability, and operational readiness.
  • Work closely with development, QA, infrastructure, and other cross-functional teams.
  • Create and maintain clear technical documentation, runbooks, operational guides, and support procedures.

Benefits

  • Quarterly bonuses based on company performance
  • 24 working days of annual leave
  • Corporate events and team building activities
  • Unlimited Udemy Business membership and language training courses
  • Professional and personal development opportunities in a fast-growing environment