Site Reliability Engineer

Precision Solutions
Remote

About The Position

The Site Reliability Engineer monitors, analyzes, and optimizes the end-to-end performance of analytics platforms in AWS. The role proactively identifies trends and potential risks, facilitates root cause analysis for recurring issues, and drives the adoption of automation and resilience improvements to maximize system uptime and user satisfaction across the analytics product suite.

Requirements

  • AWS
  • Site Reliability Engineering
  • Databricks
  • CloudWatch
  • Scripting & IaC tools
  • Incident response experience
  • Capacity planning
  • Performance tuning
  • Automating operational tasks
  • SOP documentation
  • Operational best practices

Responsibilities

  • Monitor and analyze analytics platforms in AWS
  • Identify trends and potential risks
  • Facilitate root cause analysis for recurring issues
  • Drive automation and resilience improvements
  • Support incident response, capacity planning, and operational best practices