About The Position

We are seeking a senior, hands-on AIOps & Automation Solution Architect to act as the technical backup to the Engagement Director. This role designs and implements end-to-end AIOps/observability and automation architectures, integrates monitoring platforms with ITSM, defines dashboards and KPIs, and delivers self-healing workflows. The role requires strong customer-facing capabilities, including architecture defense, RFP solutioning, and leadership of onshore and offshore engineering teams.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related discipline (or equivalent practical experience).
  • 12+ years of experience in IT operations, managed services, application support, or transformation roles with significant architecture responsibility.
  • Proven hands-on experience implementing AIOps and observability solutions across at least two enterprise platforms (e.g., Splunk, Dynatrace, Datadog, AppDynamics, Elastic).
  • Strong expertise in ITSM and ITIL processes, with demonstrated experience integrating monitoring, event management, and automation into ITSM platforms.
  • Solid background in automation and orchestration, including scripting proficiency (Python, shell, and/or PowerShell) for prototyping and integrations.
  • Experience designing or enabling GenAI and agentic AI use cases in IT operations, such as assisted triage, knowledge grounding, and runbook co-pilots.
  • Excellent communication and stakeholder management skills, with the ability to present, influence, and defend technical solutions with customer and executive audiences.

Responsibilities

  • Architect and deliver end-to-end AIOps and observability solutions, covering data collection, ingestion, correlation, analytics, dashboards, and operational workflows.
  • Design and implement integrations between monitoring/observability platforms and ITSM tools using APIs and service interfaces.
  • Define event and alert management strategies, including de-duplication, noise reduction, anomaly detection, root-cause analysis, and actionable alerting.
  • Design, operationalize, and govern self-healing and runbook automation workflows triggered by events and incidents within ITIL-aligned processes.
  • Establish dashboards, KPIs, and SLA reporting frameworks; define measurement models to track operational efficiency, business outcomes, and ROI.
  • Lead technical POVs, demos, and architecture reviews; conduct tool evaluations and defend solution designs with senior customer stakeholders.
  • Guide onshore and offshore engineering teams through architecture standards, HLD/LLD creation, backlog prioritization, and delivery governance; support RFP solutioning with architecture, roadmap, and estimates.

Benefits

  • Paid time off based on employee grade (A-F), as defined by policy: vacation (12-25 days, depending on grade), company-paid holidays, personal days, and sick leave
  • Medical, dental, and vision coverage (or provincial healthcare coordination in Canada)
  • Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
  • Life and disability insurance
  • Employee assistance programs
  • Other benefits as provided by local policy and eligibility
© 2024 Teal Labs, Inc