Site Reliability Engineer

Fiserv•Lincoln, NE

Onsite

About The Position

At Fiserv, we power money movement and commerce for thousands of financial institutions and businesses. As a Site Reliability Engineer, you will ensure the reliability, availability, and performance of mission‑critical applications by building scalable systems, robust automation, and data‑driven operations. You will collaborate across development and infrastructure teams to deliver resilient services that align with the way people live and work today.

Requirements

6+ years of experience in Kubernetes and containerized application operations (e.g., EKS, AKS, GKE), including cluster scaling, networking, and workload orchestration.
6+ years of experience in public cloud platforms (AWS, Azure, or GCP) across compute, storage, networking, IAM, and cost optimization.
6+ years of experience in observability/APM and log analytics using tools such as Dynatrace, Splunk, Prometheus, Grafana, Datadog, and ExtraHop.
6+ years of experience in implementing security and compliance controls in regulated environments (e.g., PCI DSS, SOC 2), including secrets management and vulnerability remediation.
5+ years of experience in Infrastructure as Code (Terraform, CloudFormation, Ansible) for provisioning and configuration management.
5+ years of experience in CI/CD pipeline engineering using Jenkins, GitLab CI, GitHub Actions, or Azure DevOps.
5+ years of experience in scripting and automation using Bash, PowerShell, and Python.
4+ years of an equivalent combination of educational background, related experience, and/or military experience.

Nice To Haves

Certifications such as AWS Certified SysOps Administrator, AWS DevOps Engineer, Google Professional Cloud DevOps Engineer, or Certified Kubernetes Administrator (CKA).
Experience with Premier applications, IBM iSeries, and/or Unisys systems in enterprise environments.
Hands‑on database performance tuning and operations (Oracle, SQL Server, PostgreSQL).
Advanced incident command leadership and stakeholder communication during major incidents.
Experience with ITIL/ServiceNow change, problem, and configuration management processes.

Responsibilities

Design and implement solutions that improve application reliability, scalability, and performance.
Build and maintain monitoring, alerting, and telemetry to enable proactive incident detection and swift resolution.
Lead incident response, conduct root cause analysis, and drive measurable post‑mortem improvements.
Automate operational workflows using scripting and configuration management tools.
Analyze capacity and performance trends to forecast demand and optimize cost.
Partner with development and infrastructure teams to embed operability, resilience, and security in application designs.
Support safe deployments via CI/CD pipelines, change control, and release governance.
Maintain clear runbooks, architecture diagrams, and operational documentation.
