Site Reliability Engineer

Geonetric • Cedar Rapids, IA

About The Position

As a Site Reliability Engineer, you’ll help build and support the infrastructure that keeps our services fast, available, and cost-effective. As part of the Site Reliability team, you’ll shape how we build, run, and evolve our platform. You’ll work closely with development teams to design scalable systems on Azure, drive automation across our operational workflows, and establish observability practices that give teams clear insight into system health. This is a high-impact role at the intersection of engineering and operations.

Requirements

Bachelor’s degree in Computer Science, MIS, Networking, or a related field, or equivalent, required
Typically has a minimum of 3 years’ experience
Microsoft Azure or other public cloud experience required
Cloud-based network design, build, and support experience required
Cloud-based platform and systems design, build, and support experience required
Observability platform experience required
CI/CD pipeline experience required

Nice To Haves

Experience supporting high-SLA services preferred
Experience configuring and supporting SSO application authentication preferred
Experience securing web applications, networks, and systems preferred
Previous healthcare experience is a plus

Responsibilities

Designs, implements, and maintains automated pipelines for deployment, provisioning, scaling, and remediation using infrastructure-as-code and CI/CD best practices.
Builds and maintains comprehensive monitoring, logging, alerting, and tracing frameworks that give teams full visibility into system performance and incident response.
Partners with development teams to architect resilient, highly available systems on Azure – applying SLO/SLA frameworks and reliability patterns from the ground up.
Forecasts infrastructure demand, defines scaling thresholds, and ensures systems are provisioned to meet performance targets without over-building headroom.
Continuously analyzes Azure spend, identifies waste, right-sizes workloads, and champions FinOps practices that align cloud investment with business outcomes.
Owns the end-to-end incident lifecycle – from detection and triage through resolution and postmortem – fostering a blameless culture and driving systemic improvements to prevent recurrence.
Integrates security practices into platform operations – including threat detection, vulnerability management, compliance monitoring, and hardening of infrastructure against evolving risks.
Completes standard, moderately difficult work independently.
Requires guidance and oversight to complete complex work, solve unexpected issues, and make decisions.
Manages established processes, identifies problems, and solves them with assistance.
Consistently lives our core values: Own It, Bring It, Push It, Say It, Unite.