Climavision is seeking a Senior Site Reliability Engineer to contribute towards reliability, operational excellence, and production resilience for our customer-facing platform and weather data services. This role is focused on ensuring our systems consistently meet demanding customer SLAs, including a 99.5% availability commitment for radar-derived data services. A central focus of this role is establishing multi-replica and multi-cluster high availability across our .NET services, including hands-on refactoring of C# code to make services safe to run as multiple instances and across clusters.

This is a hands-on engineering role for someone who is equally comfortable debugging production .NET services, troubleshooting Kubernetes clusters, leading incident response, and improving operational maturity across the organization. The successful candidate will combine strong software engineering experience in C# / .NET with deep production operations expertise and a disciplined approach to reliability engineering.

Climavision operates a hybrid infrastructure footprint spanning Microsoft Azure, colocation data centers, and edge Kubernetes clusters, deployed alongside weather radar systems. This role will drive production reliability across Azure, colocation, and edge environments.

35% Production Reliability Engineering
30% Application Reliability & .NET Service Architecture
20% Kubernetes Platform Reliability/Operations
15% Observability, Automation, and Operational Excellence
Job Type
Full-time
Career Level
Senior
Education Level
Associate degree