Site Reliability Engineer II

Akamai, Cambridge, MA
CRC 16,932,850 - CRC 30,479,150, Hybrid

About The Position

The Platform & Reliability Engineering team is responsible for defining, measuring, and optimizing the key performance indicators of delivery customers. Your expertise in software engineering and systems administration will be instrumental in building robust and resilient infrastructure. In this role, you'll help shape the future of our products, collaborating closely with product teams to ensure the reliability, scalability, and performance of our systems. You'll define key performance indicators (KPIs), advance the state of monitoring, alerting, and operational response, and investigate complex performance issues.

Requirements

  • 2 years of relevant experience and a Bachelor's degree in Computer Science or its equivalent
  • Hands-on experience with containerization and compute platforms such as Docker and Kubernetes
  • Experience with monitoring and alerting systems (e.g., Prometheus, Grafana, ADBMS, Datadog), including metric collection, alerting, dashboarding, and troubleshooting
  • Fluency working in a UNIX/Linux computing environment
  • Familiarity with infrastructure-as-code tools such as Terraform
  • Proficiency with a configuration management tool such as Ansible, SaltStack, Chef, Puppet, or similar

Responsibilities

  • Working on Internet technologies to improve the performance, availability, and scalability of large distributed content delivery systems
  • Collaborating with cross-functional teams to define and establish measurable Service Level Indicators and Service Level Objectives
  • Monitoring platform availability and performance, debugging issues by leveraging data analysis skills, and implementing corrective actions to avoid recurrence
  • Developing and implementing automation solutions to improve operational efficiency and reduce toil
  • Improving CI/CD pipelines and safe deployment practices for platform services
  • Participating in design reviews and providing technical guidance to ensure designs meet requirements for scalability, performance, and robustness

Benefits

  • healthcare
  • 401(k) savings plan
  • company holidays
  • vacation (in the form of PTO)
  • sick time
  • family-friendly benefits including parental leave
  • employee assistance program including a focus on mental and financial wellness