The Senior Monitoring & Observability Engineer (TOC) is an infrastructure and reliability engineering role responsible for designing, implementing, optimizing, and supporting enterprise monitoring and observability platforms across networks, systems, cloud environments, and critical business applications. The position combines observability engineering, cloud and infrastructure operations, automation, and incident management, and includes ownership of monitoring tools such as Datadog, Splunk, SolarWinds, Dynatrace, AppDynamics, Nagios, PRTG, and Zabbix. As a technical escalation point, the engineer partners closely with Infrastructure, Security, DevOps, and IT Operations teams to improve system reliability, alert quality, operational efficiency, and service availability, and supports SRE-aligned practices such as automation, root cause analysis, SLIs/SLOs, and continuous operational improvement.
Job Type
Full-time
Career Level
Senior