Lead Platform Engineer

Transamerica
Cedar Rapids, IA
Hybrid

About The Position

As a Lead Platform Engineer of Observability and Monitoring, you set the technical direction for how enterprise observability and monitoring capabilities are designed, delivered, and consumed across cloud and on-premises systems, infrastructure, and applications. Your focus is on enabling application and platform teams to adopt standardized monitoring, logging, and event-driven capabilities through well-defined patterns, integrations, and automation. You lead the configuration, customization, and delivery of observability capabilities across enterprise platforms, including application performance monitoring, centralized logging, event correlation, and automated alerting. You establish architectural standards, best practices, and reference designs that ensure monitoring solutions are scalable, resilient, secure, and aligned with enterprise strategy. You provide technical leadership for business and technical analysis, architectural reviews, and complex solution design, partnering closely with stakeholders to translate operational and business requirements into consistent, data-driven observability implementations. In this role, you drive enterprise-scale adoption of automated monitoring and alerting through CI/CD enablement, configuration as code, and reusable ingestion, dashboard, and visualization patterns for structured and unstructured telemetry data. You lead the design and implementation of alerting and event correlation integrations with ITSM and event management platforms such as BigPanda, ensuring actionable signals flow cleanly from applications and infrastructure into operational workflows. The Lead Platform Engineer is expected to champion automation, security best practices, and continuous improvement in observability capability delivery, with deep hands-on expertise in tools such as Elastic, AppDynamics, and BigPanda.
Through mentoring and technical leadership, you enable teams to deliver consistent, scalable, and intelligent monitoring solutions that improve operational visibility, accelerate incident response, and strengthen overall service resilience, without coupling teams to the operational burden of the monitoring platforms themselves.

Requirements

  • Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent education/experience, and 8–10+ years of related work experience
  • Demonstrated ability to lead the design and enforcement of monitoring standards in collaboration with application teams (AppDynamics, Elastic Stack, CloudWatch, Site24x7)
  • Extensive experience architecting, engineering, and scaling distributed telemetry pipelines (Elastic ingestion, data normalization, dashboards)
  • Expert-level proficiency configuring alert normalization, enrichment, and correlation patterns at enterprise scale
  • Advanced experience with the Open Integration Hub and with webhook-based and API-driven event ingestion
  • Deep understanding of the BigPanda incident lifecycle, correlation models, and automated routing to ServiceNow
  • Expert understanding of logs, metrics, traces, and observability concepts (APM, RUM, synthetic monitoring)
  • Proven ability to design, configure, and optimize AI-driven workflows (automated incident analysis, similar incidents, change risk scoring)
  • Strong familiarity with vector DB concepts, enrichment pipelines, and generative AI guardrails
  • Advanced knowledge of SSO, OAuth, API Gateway patterns, and secured data flows
  • Expert-level AWS experience (Lambda, S3, API Gateway, CloudWatch, IAM)
  • Demonstrated ability to interpret telemetry, identify patterns proactively, and influence engineering outcomes
  • Advanced proficiency in AI prompt engineering
  • Extensive experience interacting with large language models and incorporating them into platform workflows
  • Proven experience as a Lead Platform Engineer or in a similar role (e.g., M365, AWS, or Azure Engineer)
  • Expert understanding of cloud technologies, DevOps processes, and large-scale automation of services
  • Extensive experience with CI/CD tools and practices (e.g., Jenkins, Azure Pipelines)
  • Advanced experience with automation and scripting tools (e.g., PowerShell, Graph API)

Nice To Haves

  • Hands-on leadership experience with BigPanda and Biggy AI implementations
  • Deep expertise with Elastic and its advanced platform capabilities
  • Experience leading monitoring and logging integrations with ServiceNow at scale
  • Strong knowledge of security best practices in platform and cloud engineering
  • Advanced certifications in cloud platforms (GCP, AWS, Azure, M365)
  • Proven ability to mentor, coach, and technically lead engineers across teams

Responsibilities

  • Lead the design, development, and evolution of monitoring solutions in support of IT operations systems, infrastructure, and applications, both cloud and on-premises
  • Provide technical leadership for business and technical analysis and architectural reviews with customers
  • Lead and continuously improve enterprise-scale continuous integration/continuous delivery (CI/CD) processes and pipelines
  • Drive strategy and implementation of automated monitoring and alerting across platforms and services
  • Oversee the design and development of ingest pipelines, visualizations, and dashboard capabilities for structured and unstructured data
  • Lead the design and implementation of triggered alert functionality, including on-screen alerts and event integrations with ITSM and event management platforms
  • Provide escalation support and leadership for day-to-day Request and Incident ticket work as necessary
  • Lead collaboration with stakeholders to gather requirements, develop solution designs, and ensure scalability, resiliency, and efficiency of platform architectures
  • Establish and govern system guidelines, process documentation, and training materials for the organization
  • Proactively assess and lead responses to emerging requirements and ambiguous technology decisions
  • Lead and coordinate IT and business unit projects related to platform and collaboration solutions, including acquisitions, divestitures, and migrations

Benefits

  • Compensation
  • Benefits Package
  • Pension Plan
  • 401k Match
  • Employee Stock Purchase Plan
  • Tuition Reimbursement
  • Disability Insurance
  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • Employee Discounts
  • Career Training & Development Opportunities
  • Health and Work/Life Balance Benefits
  • Paid Time Off starting at 160 hours annually for employees in their first year of service.
  • Ten (10) paid holidays per year (typically mirroring the New York Stock Exchange (NYSE) holidays).
  • Be Well Company holistic wellness program, which includes Wellness Coaching and Reward Dollars
  • Parental Leave – fifteen (15) days of paid parental leave per calendar year to eligible employees with at least one year of service at the time of birth, placement of an adopted child, or placement of a foster care child.
  • Adoption Assistance
  • Employee Assistance Program
  • Back-Up Care Program
  • PTO for Volunteer Hours
  • Employee Matching Gifts Program
  • Employee Resource Groups
  • Inclusion and Diversity Programs
  • Employee Recognition Program
  • Referral Bonus Programs