Lead Analyst, Service Reliability

Intercontinental Exchange Holdings, Inc.
Atlanta, GA

About The Position

Intercontinental Exchange (NYSE: ICE) is seeking a Lead Analyst to join the Service Reliability department. This role bridges hands-on operational leadership with data-driven insight, serving as both an operational subject matter expert and an analytical resource for ICE’s Service Reliability organization, which spans Systems Operations, Site Reliability Engineering, Automation Engineering, and Business Continuity. The ideal candidate brings a strong foundation in systems operations and incident management, alongside the analytical skills to translate complex operational data into clear, actionable intelligence. You will drive consistency across numerous business segments and several supported businesses, champion process improvement, and deliver the reporting and insights leadership needs to make informed reliability and continuity decisions.

Requirements

  • Bachelor’s degree in Information Technology, Business, Business Analytics, Data Science, or a related field, or equivalent hands-on experience.
  • 3+ years of experience in IT operations, systems operations, business analysis, or a combined analytical and operational role within a technology or financial services environment.
  • Demonstrated proficiency with data visualization tools (Tableau, Power BI) and advanced Excel including pivot tables, Power Query, and complex formulas.
  • Hands-on experience with ITSM platforms such as ServiceNow, including incident, change, and problem management workflows.
  • Proven ability to define and track operational KPIs and translate metric outcomes into actionable business recommendations.
  • Strong written and verbal communication skills, with the ability to tailor messaging for both technical teams and executive audiences.
  • Highly organized, detail-oriented, and capable of managing multiple concurrent workstreams in a dynamic, high-accountability environment.

Nice To Haves

  • Operational Expertise: Solid grounding in IT operations, incident management, and service reliability practices; comfortable navigating complex, multi-team operational environments.
  • Business Continuity & Resilience: Experience developing and maintaining BCP and DR programs, coordinating continuity exercises, and producing compliance scorecards and recovery documentation.
  • Data Analytics & BI Development: Proven track record building self-service dashboards and automated reporting pipelines that enable operational leaders to act on data independently.
  • Business Analysis Methodology: Familiar with formal BA practices including requirements elicitation, gap analysis, use case development, and stakeholder mapping.
  • Reliability Metrics Fluency: Deep familiarity with SRE concepts and MTTx metrics (MTTR, MTTD, MTTA, MTBF), and the ability to contextualize these in executive and client-facing reporting.
  • Process Improvement: Practical experience applying Lean, Six Sigma, or similar frameworks to measure inefficiencies and drive sustainable operational improvements.
  • ITSM & Operational Tooling: Working knowledge of ServiceNow, PagerDuty, BigPanda, and Rundeck; experience querying, analyzing, and reporting on data within these platforms.
  • SQL & Data Querying: Ability to write and optimize SQL queries for operational data extraction and analysis; experience with BI layer or data warehouse access a plus.
  • Cross-Functional Communication: Skilled at translating technical concepts for non-technical audiences and collaborating across IT, operations, and business leadership to drive aligned outcomes.
  • Project Coordination: Experienced managing multiple initiatives simultaneously, tracking deliverables, and aligning operational workstreams with broader strategic objectives.

Responsibilities

  • Oversee day-to-day operational processes across the Service Reliability department, ensuring consistency, efficiency, and alignment with ICE standards across all supported business segments.
  • Maintain and improve operational workflows using Lean/Six Sigma principles, mapping current-state processes to uncover inefficiencies and drive measurable improvements.
  • Support the onboarding of newly acquired companies into ICE’s Incident Management and Service Reliability frameworks, coordinating integration planning and execution.
  • Operate effectively in a fast-paced, dynamic environment where flexibility, clear prioritization, and responsiveness to shifting operational demands are essential.
  • Assist in coordinating cross-functional projects, tracking progress, managing timelines, and ensuring operational deliverables align with broader strategic goals.
  • Collaborate with IT, operations, and leadership teams to support change management activities and the rollout of new tools, platforms, and operational processes.
  • Occasional travel may be required to support integration efforts or cross-functional collaboration with global teams.
  • Lead Incident Reliability Engineering oversight, ensuring timely, accurate reporting of incident metrics, trends, and outcomes across all supported business segments.
  • Develop and maintain training materials, playbooks, and knowledge base content to support ICE’s Incident Management processes and promote operational best practices.
  • Analyze incident patterns within ServiceNow, PagerDuty, and BigPanda to drive continuous improvement in detection, response, and resolution performance.
  • Define, track, and report on key reliability KPIs including MTTx metrics (MTTR, MTTD, MTTA) to measure and communicate operational effectiveness to leadership.
  • Design, build, and maintain operational dashboards and reports using Tableau, Power BI, and Excel, providing leadership with a clear, real-time view of service reliability performance.
  • Own the data pipeline for Service Reliability metrics including collection, transformation, validation, visualization, and stakeholder delivery.
  • Perform trend analysis and statistical modeling on incident, reliability, and operational data to surface patterns and inform proactive decision-making.
  • Develop predictive analytics models to anticipate operational risks and support leadership in making data-driven reliability investments.
  • Partner with engineering and operations teams to close data gaps, improve telemetry coverage, and ensure integrity across all reporting sources.
  • Elicit, document, and validate business and operational requirements from stakeholders across IT, operations, and leadership, translating needs into structured frameworks and specifications.
  • Map workflows and information flows across the Service Reliability department to identify automation opportunities, eliminate redundancy, and support efficiency initiatives.
  • Lead requirements gathering for new tooling, platform enhancements, and reporting capabilities, bridging the gap between operational needs and technical implementation.
  • Apply Lean and Six Sigma methodologies to quantify process inefficiencies, measure improvement impact, and build a culture of continuous operational improvement.
  • Lead and maintain the Business Continuity Planning (BCP) program across ICE’s Service Reliability organization, ensuring plans, procedures, and recovery strategies are current, tested, and actionable.
  • Develop and manage BC/DR exercise records, scorecard decks, and reporting deliverables that communicate continuity posture to leadership and external stakeholders.
  • Coordinate BC/DR exercises and tabletop simulations across supported business segments, documenting outcomes, tracking remediation items, and driving closure of identified gaps.
  • Maintain and enhance priority matrix frameworks and failure domain classifications that support recovery time objective (RTO) and recovery point objective (RPO) planning.
  • Collaborate with IT, operations, and business leaders to ensure personnel-centric and technology-centric continuity considerations are addressed across all recovery plans.
  • Track and report on BCP compliance and readiness across 13+ business segments, producing data-driven scorecards that surface risk and drive accountability.
  • Support newly acquired companies in aligning their continuity and resilience programs with ICE’s established BCP and DR frameworks.
  • Communicate operational findings, analytical insights, and reliability trends clearly to both technical and non-technical audiences, including senior leadership and external stakeholders.
  • Deliver confident, authoritative instruction and coaching to managers and senior leaders on process optimization, operational metrics, and reliability improvement strategies.
  • Produce high-quality stakeholder documentation including executive presentations, quarterly business reviews, operational reports, and customer-facing communications.
  • Collaborate across IT, operations, and business teams to ensure operational changes and data initiatives are well communicated, well documented, and adopted effectively.