You will design, build, and maintain the observability platform tools and frameworks that enable development and operations teams to monitor and improve the performance, availability, and reliability of systems. The role involves implementing systems that monitor and analyze the performance and health of software applications and infrastructure. You will collaborate closely with development, site reliability engineering, DevOps, and infrastructure teams to deliver a seamless observability ecosystem. Key responsibilities include architecting observability platforms, integrating monitoring tools into software pipelines, ensuring system health visibility, reducing mean time to detection (MTTD), and promoting a culture of proactive monitoring and reliability engineering.
Job Type: Full-time
Career Level: Mid Level
Number of Employees: 1,001-5,000