Lead Platform Engineer

Transamerica • Cedar Rapids, IA

46d • Hybrid

About The Position

As a Lead Platform Engineer of Observability and Monitoring, you set the technical direction for how enterprise observability and monitoring capabilities are designed, delivered, and consumed across cloud and on-premises systems, infrastructure, and applications. Your focus is on enabling application and platform teams to adopt standardized monitoring, logging, and event-driven capabilities through well-defined patterns, integrations, and automation.

You lead the configuration, customization, and delivery of observability capabilities across enterprise platforms, including application performance monitoring, centralized logging, event correlation, and automated alerting. You establish architectural standards, best practices, and reference designs that ensure monitoring solutions are scalable, resilient, secure, and aligned with enterprise strategy. You provide technical leadership for business and technical analysis, architectural reviews, and complex solution design, partnering closely with stakeholders to translate operational and business requirements into consistent, data-driven observability implementations.

In this role, you drive enterprise-scale adoption of automated monitoring and alerting through CI/CD enablement, configuration as code, and reusable ingestion, dashboard, and visualization patterns for structured and unstructured telemetry data. You lead the design and implementation of alerting and event correlation integrations with ITSM and event management platforms such as BigPanda, ensuring actionable signals flow cleanly from applications and infrastructure into operational workflows. The Lead Platform Engineer is expected to champion automation, security best practices, and continuous improvement in observability capability delivery, with deep hands-on expertise in tools such as Elastic, AppDynamics, and BigPanda.

Through mentoring and technical leadership, you enable teams to deliver consistent, scalable, and intelligent monitoring solutions that improve operational visibility, accelerate incident response, and strengthen overall service resilience, without coupling teams to the operational burden of the monitoring platforms themselves.

Requirements

Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent education/experience, and 8–10+ years of related work experience
Demonstrated ability to lead the design and enforcement of monitoring standards (AppDynamics, Elastic Stack, CloudWatch, Site24x7) in collaboration with application teams
Extensive experience architecting, engineering, and scaling distributed telemetry pipelines (Elastic ingestion, data normalization, dashboards)
Expert-level proficiency configuring alert normalization, enrichment, and correlation patterns at enterprise scale
Advanced experience with the Open Integration Hub and with webhook-based and API-driven event ingestion
Deep understanding of the BigPanda incident lifecycle, correlation models, and automated routing to ServiceNow
Expert understanding of logs, metrics, traces, and observability concepts (APM, RUM, synthetic monitoring)
Proven ability to design, configure, and optimize AI-driven workflows (automated incident analysis, similar incidents, change risk scoring)
Strong familiarity with vector DB concepts, enrichment pipelines, and generative AI guardrails
Advanced knowledge of SSO, OAuth, API Gateway patterns, and secured data flows
Expert-level AWS experience (Lambda, S3, API Gateway, CloudWatch, IAM)
Demonstrated ability to interpret telemetry, identify patterns proactively, and influence engineering outcomes
Advanced proficiency in AI prompt engineering
Extensive experience interacting with large language models and incorporating them into platform workflows
Proven experience as a Lead Platform Engineer or in a similar role (e.g., M365, AWS, or Azure Engineer)
Expert understanding of cloud technologies, DevOps processes, and large-scale automation of services
Extensive experience with CI/CD tools and practices (e.g., Jenkins, Azure Pipelines)
Advanced experience with automation and scripting tools (e.g., PowerShell, Graph API)

Nice To Haves

Hands-on leadership experience with BigPanda and Biggy AI implementations
Deep expertise with Elastic and its advanced platform capabilities
Experience leading monitoring and logging integrations with ServiceNow at scale
Strong knowledge of security best practices in platform and cloud engineering
Advanced certifications in cloud platforms (GCP, AWS, Azure, M365)
Proven ability to mentor, coach, and technically lead engineers across teams

Responsibilities

Lead the design, development, and evolution of monitoring solutions in support of IT operations systems, infrastructure, and applications, both cloud and on-premises.
Provide technical leadership for business and technical analysis and architectural reviews with customers.
Lead and continuously improve enterprise-scale continuous integration/continuous delivery (CI/CD) processes and pipelines.
Drive strategy and implementation of automated monitoring and alerting across platforms and services.
Oversee the design and development of ingest pipelines, visualizations, and dashboard capabilities for structured and unstructured data.
Lead the design and implementation of triggered alert functionality, including on-screen alerts and event integrations with ITSM and event management platforms.
Provide escalation support and leadership for day-to-day request and incident ticket work as necessary.
Lead collaboration with stakeholders to gather requirements, develop solution designs, and ensure scalability, resiliency, and efficiency of platform architectures.
Establish and govern system guidelines, process documentation, and training materials for the organization.
Proactively assess and lead responses to emerging requirements and ambiguous technology decisions.
Lead and coordinate IT and business unit projects related to platform and collaboration solutions, including acquisitions, divestitures, and migrations.

Benefits

Compensation
Benefits Package
Pension Plan
401k Match
Employee Stock Purchase Plan
Tuition Reimbursement
Disability Insurance
Medical Insurance
Dental Insurance
Vision Insurance
Employee Discounts
Career Training & Development Opportunities
Health and Work/Life Balance Benefits
Paid Time Off starting at 160 hours annually for employees in their first year of service.
Ten (10) paid holidays per year (typically mirroring the New York Stock Exchange (NYSE) holidays).
Be Well Company holistic wellness program, which includes Wellness Coaching and Reward Dollars
Parental Leave – fifteen (15) days of paid parental leave per calendar year for eligible employees with at least one year of service at the time of birth, placement of an adopted child, or placement of a foster care child.
Adoption Assistance
Employee Assistance Program
Back-Up Care Program
PTO for Volunteer Hours
Employee Matching Gifts Program
Employee Resource Groups
Inclusion and Diversity Programs
Employee Recognition Program
Referral Bonus Programs