Lead Production Support Analyst

Transamerica | Cedar Rapids, IA
Posted 1d ago | $114,000 - $140,000 | Hybrid

About The Position

The Lead Production Support & Operations role is responsible for end-to-end production support management for a defined line of business (Individual Solutions and WFG), ensuring availability, stability, performance, and operational excellence for business-critical applications and services. This Lead oversees a vendor/contractor production team, drives incident/problem/change rigor, and delivers measurable improvements through automation, monitoring enhancements, and operational standardization. The role is preferably hands-on (or strongly technically fluent), with the ability to guide triage, diagnose complex issues across application/infrastructure/database layers, and partner effectively with engineering, infrastructure, security, and business stakeholders.

Requirements

  • 8+ years in production support, IT operations, cloud operations, or SRE/Platform operations, with 3+ years in a lead role (team lead, service owner, or vendor lead).
  • Strong knowledge of ITSM/ITIL practices and hands-on experience with ServiceNow (Incident/Problem/Change; Event Management preferred).
  • Demonstrated ability to lead high-severity incident response, drive cross-functional execution, and ensure disciplined RCA/PIR completion.
  • Proven experience managing vendor/contractor teams, including performance management through KPIs, governance routines, and continuous improvement plans.
  • Technical fluency across applications, infrastructure, cloud, and database layers, with the ability to guide triage and validate solutions.
  • Strong documentation skills: runbooks, SOPs, support models, escalation procedures, and operational readiness checklists.
  • Excellent communication skills, with the ability to translate complex technical events into business impact and executive-ready updates.

Nice To Haves

  • Experience supporting financial services/insurance applications and regulated environments (audit, evidence capture, change controls).
  • Experience implementing automation (runbook automation, scripting, auto-remediation) and improving observability practices.
  • Exposure to SLO/SLI definitions, reliability reporting, and operational scorecards.
  • Experience with multi-sourced/global delivery models and coordinating across time zones.
  • Bachelor’s degree in Information Technology, Computer Science, or a related field (or equivalent experience); advanced degree a plus.

Responsibilities

  Operational & Production Support Leadership
  • Lead day-to-day production support operations for Individual Solutions & WFG applications/services, ensuring high availability, performance, and stability.
  • Act as the accountable owner for the production support operating model, including L1/L2/L3 routing, on-call rotations, escalation paths, and SLAs/SLOs.
  • Oversee and coach a vendor/contractor support team, ensuring quality execution, clear accountability, and consistent outcomes across shifts/time zones.
  • Own application onboarding into production support: ensure runbooks, SOPs, architecture diagrams, support metrics, monitoring/alerting, access, and DR/backup readiness are complete and current.
  • Establish operational readiness standards across logging, monitoring, access controls, backup, disaster recovery, and maintenance windows.
  Vendor Management & Service Delivery
  • Manage vendor performance (tickets, SLAs, MTTR, quality of RCAs, repeat incidents, documentation hygiene) and drive continuous service improvement.
  • Run recurring vendor governance: operational reviews, KPI scorecards, backlog prioritization, and corrective action plans.
  • Coordinate with third-party providers for escalations, service requests, planned maintenance, patching, and production changes.
  Incident, Problem & Change Management
  • Serve as the primary escalation point for high-severity incidents; lead war rooms/bridge calls and drive timely resolution with strong communication.
  • Ensure Root Cause Analysis (RCA) and Post-Incident Reviews (PIRs) are completed with actionable remediation, prevention plans, and measurable follow-through.
  • Drive problem management: identify patterns and recurring issues using incident history, logs, and metrics; reduce repeat incidents through permanent fixes.
  • Oversee change/release execution to minimize production risk: pre-change validation, approvals, rollback plans, post-release monitoring, and “go/no-go” decision support.
  • Ensure adherence to ITSM processes and audit-ready evidence for incident/change/problem workflows.
  Monitoring, Observability & Reliability
  • Improve detection and response through dashboards, health checks, distributed tracing/APM, synthetic monitoring, and log correlation.
  • Tune alerting to improve the signal-to-noise ratio; implement event correlation to prevent alert storms.
  • Partner with engineering and platform teams to define/track error budgets (where applicable) and reliability improvements.
  Continuous Improvement, Automation & Incident Reduction
  • Proactively identify opportunities for automation (self-healing, auto-remediation, runbook automation, standardized scripts) that reduce toil and improve MTTR.
  • Drive operational standardization: repeatable onboarding, consistent runbooks, automated checks, and common monitoring patterns.
  • Lead initiatives focused on reducing incident volume, shortening recovery times, improving release quality, and removing manual steps from common procedures.

Benefits

  • Competitive Pay
  • Bonus for Eligible Employees
  • Benefits Package
  • Pension Plan
  • 401k Match
  • Employee Stock Purchase Plan
  • Tuition Reimbursement
  • Disability Insurance
  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • Employee Discounts
  • Career Training & Development Opportunities
  • Paid Time Off starting at 160 hours annually for employees in their first year of service.
  • Ten (10) paid holidays per year (typically mirroring the New York Stock Exchange (NYSE) holidays).
  • Be Well Company holistic wellness program, which includes Wellness Coaching and Reward Dollars
  • Parental Leave – fifteen (15) days of paid parental leave per calendar year for eligible employees with at least one year of service at the time of birth, placement of an adopted child, or placement of a foster care child.
  • Adoption Assistance
  • Employee Assistance Program
  • Back-Up Care Program
  • PTO for Volunteer Hours
  • Employee Matching Gifts Program
  • Employee Resource Groups
  • Inclusion and Diversity Programs
  • Employee Recognition Program
  • Referral Bonus Programs