Principal AI Application Operations Engineer

GE Vernova
$137,700 - $229,600

About The Position

Drives the ongoing management and maintenance of all Digital infrastructure, products, services, and operations to ensure business continuity. The role carries high levels of operational judgment and accountability, balancing autonomy with alignment to operating policy objectives. Key responsibilities span capacity management, incident response, help desk management, and software quality improvement, with a direct impact on the team's ability to deliver services that meet defined standards of quality, timeliness, and business value.

Requirements

  • Minimum of a Bachelor's Degree in a relevant discipline
  • 10+ years of relevant professional experience

Nice To Haves

  • Demonstrated expertise in SaaS application incident management and enterprise-grade operational management
  • Software Development Lifecycle (SDLC) including Agile, DevOps, and CI/CD practices
  • New Product Introduction (NPI) processes and product roadmap rollout (push to production)
  • Incident management frameworks (e.g., ITIL) and SLA/SLO governance
  • AI/ML application operations or AI Ops tooling and practices
  • Experience working with observability tools (e.g., Datadog, Splunk, Dynatrace) and log management systems to troubleshoot issues.
  • Strong oral and written communication skills.
  • Strong interpersonal and leadership skills.
  • Demonstrated ability to analyze and resolve problems.
  • Demonstrated ability to lead programs / projects.
  • Ability to document, plan, market, and execute programs.
  • Established project management skills.
  • Hands-on familiarity with AI/ML platforms, monitoring tools, and cloud infrastructure (e.g., Azure, AWS, GCP)
  • Experience establishing support through purchased services agreements and coordinating L1/L2/L3 support.
  • Experience setting up PagerDuty rotations and support workstreams.

Responsibilities

  • Serve as the primary operational owner for AI application infrastructure, monitoring system health, managing incident detection, escalation, and timely resolution in alignment with defined SLA/SLO targets across the DevOps lifecycle
  • Lead Root Cause Analysis (RCA) reviews and Push to Production meetings, ensuring readiness criteria are met, risks are communicated, and corrective actions are tracked to closure
  • Own and report on SLA/SLO performance metrics, providing regular dashboards to leadership and driving data-informed decisions to address trends and service gaps
  • Influence the development of operational strategy for AI applications, including resource planning, policy formulation, and tooling roadmap alignment
  • Proactively monitor industry trends in AI Ops, MLOps, and DevSecOps, translating emerging best practices into actionable improvements for the organization
  • Apply high-level operational judgment to navigate complex technical and business challenges, constructing well-reasoned recommendations for problems that extend beyond standard operating parameters
  • Identify and implement process improvements across deployment pipelines, monitoring frameworks, and support workflows to improve reliability, scalability, and operational efficiency
  • Champion a quality-first culture by defining and enforcing operational standards, quality policies, and best practices across the AI application support team
  • Lead cross-functional teams and projects, communicating complex technical concepts clearly to both technical and non-technical audiences, including senior leadership and external partners
  • Mentor and guide team members, fostering professional growth while serving as a bridge between development, operations, and business stakeholders to ensure alignment on priorities and outcomes

Benefits

  • medical
  • dental
  • vision
  • prescription drug coverage
  • access to Health Coach from GE Vernova, a 24/7 nurse-based resource
  • access to the Employee Assistance Program, providing 24/7 confidential assessment, counseling and referral services
  • GE Vernova Retirement Savings Plan, a tax-advantaged 401(k) savings opportunity with company matching contributions and company retirement contributions, as well as access to Fidelity resources and financial planning consultants
  • tuition assistance
  • adoption assistance
  • paid parental leave
  • disability benefits
  • life insurance
  • 12 paid holidays
  • permissive time off
  • relocation assistance