As a Lead Platform Engineer of Observability and Monitoring, you set the technical direction for how enterprise observability and monitoring capabilities are designed, delivered, and consumed across cloud and on-premises systems, infrastructure, and applications. Your focus is on enabling application and platform teams to adopt standardized monitoring, logging, and event-driven capabilities through well-defined patterns, integrations, and automation. You lead the configuration, customization, and delivery of observability capabilities across enterprise platforms, including application performance monitoring, centralized logging, event correlation, and automated alerting. You establish architectural standards, best practices, and reference designs that ensure monitoring solutions are scalable, resilient, secure, and aligned with enterprise strategy. You provide technical leadership for business and technical analysis, architectural reviews, and complex solution design, partnering closely with stakeholders to translate operational and business requirements into consistent, data-driven observability implementations.
In this role, you drive enterprise-scale adoption of automated monitoring and alerting through CI/CD enablement, configuration as code, and reusable ingestion, dashboard, and visualization patterns for structured and unstructured telemetry data. You lead the design and implementation of alerting and event correlation integrations with ITSM and event management platforms such as BigPanda, ensuring actionable signals flow cleanly from applications and infrastructure into operational workflows. You are expected to champion automation, security best practices, and continuous improvement in observability capability delivery, with deep hands-on expertise in tools such as Elastic, AppDynamics, and BigPanda.
Through mentoring and technical leadership, you enable teams to deliver consistent, scalable, and intelligent monitoring solutions that improve operational visibility, accelerate incident response, and strengthen overall service resilience, without coupling teams to the operational burden of the monitoring platforms themselves.
Job Type
Full-time
Career Level
Mid Level