Lead Infrastructure Engineer - Infrastructure Monitoring

JPMorgan Chase & Co., Wilmington, DE

About The Position

We have an exciting opportunity for you to collaborate with passionate professionals, solve complex problems, and grow your career in a supportive, innovative environment. As a Lead Infrastructure Engineer at JPMorgan Chase within Corporate Technology's Enterprise Observability Platforms, you will help build and operate a strategic, market-leading Infrastructure Monitoring platform that strengthens critical service resilience and delivers trusted operational insights. You will be a hands-on technical contributor on a high-performing agile team, building secure, stable, and scalable observability solutions: turning telemetry into actionable insights, modernizing event-to-incident workflows, and enabling automation and AIOps-driven reliability improvements aligned to the firm’s business objectives.

JPMorgan Chase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses, and many of the world’s most prominent corporate, institutional, and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years, and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing, and asset management. Professionals in our Corporate Functions cover a diverse range of areas, from finance and risk to human resources and marketing. Our corporate teams are an essential part of our company, ensuring that we’re setting our businesses, clients, customers, and employees up for success.

Requirements

  • Formal training or certification in infrastructure engineering concepts and 5+ years of applied experience.
  • Proficiency with enterprise operating systems (Linux and/or Windows), including administration, troubleshooting, performance analysis, and operational best practices within regulated production environments.
  • Proven hands-on experience delivering and operating enterprise-scale Infrastructure Monitoring solutions across Linux, Windows, and/or Network estates.
  • Solid understanding and hands-on implementation of observability and telemetry concepts, including metrics, logs, and events, with experience using OpenTelemetry collection patterns and integrating telemetry into downstream components.
  • Proficiency in automation and engineering practices, including scripting and development with Python, Ansible, PowerShell / Bash, and applying CI/CD-driven workflows for controlled, secure, and repeatable change management.
  • Well-rounded experience in infrastructure across hardware platforms, operating systems, networking, storage, and databases (MS SQL Server, Oracle, Cassandra), including common deployment patterns, integration architectures, scaling and resiliency considerations, and performance assessment.
  • Experience implementing Infrastructure-as-Code (IaC) and configuration management practices using tools such as Terraform, enabling standardized provisioning and scalable, repeatable deployments.
  • Hands-on experience operating in hybrid infrastructure environments, including enterprise on-prem platforms and public/private cloud, with familiarity in supporting and migrating monitoring capabilities across cloud boundaries.
  • Demonstrated ability to improve monitoring signal quality through baselining, threshold strategy, noise reduction, enrichment, and topology/context alignment, supporting reliable event-to-incident workflows and operational insights.
  • Experience developing, reviewing, debugging, and maintaining secure, high-quality production code and platform configurations, including automation supporting monitoring platforms and platform operations.

Nice To Haves

  • Hands-on experience operating one or more enterprise monitoring platforms such as SCOM, Tivoli, SMARTS, IBM Instana, DX NetOps, ITNM, or Netcool Suite.
  • Experience with modern observability ecosystems such as Splunk, Dynatrace, Grafana, and Prometheus, and with interoperability patterns for telemetry integration, routing, and visualization.
  • Experience with Kubernetes (e.g., EKS) for container orchestration and operations.
  • Experience with topology-driven monitoring and correlation approaches for large-scale infrastructure environments.
  • Knowledge of Event Management & AIOps workflows (noise reduction, anomaly detection, probable cause analysis, guided remediation) with appropriate controls.

Responsibilities

  • Engineer, operate, and continuously improve the firm’s Infrastructure Monitoring platforms, ensuring availability, performance, scalability, and security.
  • Build and run enterprise-grade Infrastructure Monitoring capabilities across Linux, Windows, and complex Network estates, including platform-level onboarding and lifecycle management.
  • Design and implement platform services, integrations, and telemetry collection across metrics, logs, and events, including OpenTelemetry collection patterns where applicable.
  • Develop and maintain standardized onboarding patterns (agents/collectors, configurations, dashboards, alert policies) to accelerate safe adoption at scale.
  • Improve monitoring signal quality and usability through baselining, threshold strategy, noise reduction, enrichment, and topology/context alignment.
  • Develop secure, high-quality automation and production code; review, debug, and improve code/configuration written by others.
  • Automate platform operations and reduce toil through scripting and CI/CD-driven configuration management; implement infrastructure-as-code deployment patterns.
  • Manage & maintain production health for the monitoring platform: lead triage, perform RCA, and deliver preventative engineering and resilience improvements.
  • Partner with infrastructure, application, and SRE teams to align platform capabilities to SLIs/SLOs, operational readiness, and continuous improvement goals.
  • Contribute to a culture of diversity, opportunity, inclusion, and respect.

Benefits

  • Competitive total rewards package, including base salary determined by role, experience, skill set, and location
  • Commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions
  • Comprehensive health care coverage
  • On-site health and wellness centers
  • A retirement savings plan
  • Backup childcare
  • Tuition reimbursement
  • Mental health support
  • Financial coaching

What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000 employees
