SRE Engineer

Infotree Global Solutions
Capon Bridge, WV

About The Position

We are seeking a skilled and independent Site Reliability Engineer to join our client’s engineering team for a project-based engagement focused on Production Engineering and Site Reliability Engineering (SRE). This role requires a proven ability to deliver technical solutions, triage and resolve complex production issues, and work independently while collaborating with engineering and infrastructure teams when necessary. The engagement is deliverable-focused, aimed at improving system reliability, automation, and operational efficiency within a high-performance production environment.

Requirements

  • Proficiency in one or more programming languages such as Python, C++, Java, or shell scripting.
  • Strong understanding of Linux operating system internals.
  • Solid knowledge of networking concepts and troubleshooting.
  • Experience with modern version control systems such as Git.
  • At least 1 year of professional software development or reliability engineering experience.
  • Ability to work independently, manage priorities effectively, and deliver results with minimal supervision.
  • Strong analytical mindset with the ability to diagnose and resolve issues in complex production environments.
  • Excellent written and verbal communication skills, with the ability to clearly communicate technical topics to engineering stakeholders.
  • Ability to quickly learn new technologies and tools and work across multiple programming languages and frameworks.

Nice To Haves

  • Familiarity with monitoring, logging, and CI/CD tools (e.g., Prometheus, Grafana, Splunk, Jenkins, GitLab CI).

Responsibilities

  • Design, develop, test, and deploy automation tools, scripts, and engineering solutions to improve the stability, performance, and efficiency of production systems.
  • Identify opportunities to automate manual operational processes and reduce operational overhead.
  • Support and improve the release and deployment lifecycle of applications, ensuring reliable and controlled production rollouts.
  • Collaborate with software engineers and infrastructure teams to troubleshoot and resolve system issues.
  • Contribute to system design discussions, platform management, and capacity planning.
  • Create and maintain clear technical documentation for automation tools, operational procedures, and reliability improvements.
  • Provide regular updates on progress and deliverables to engineering stakeholders.