Oracle, posted 3 months ago
Senior
5,001-10,000 employees

We are seeking a Principal Technical Program Manager (TPM) to join the Network Reliability Engineering (NRE) organization within Oracle Cloud Infrastructure (OCI). This role is part of the Availability PMO and is pivotal in driving large-scale, cross-organizational programs that improve network resiliency, automation, and AI-driven operations. The ideal candidate will have deep experience leading technical initiatives at scale, orchestrating work across product, engineering, and operations while driving measurable improvements in reliability, availability, and efficiency. You will also help shape OCI’s transformation toward AI-powered and autonomous network operations, collaborating closely with the Network Engineering, GNOC, Automation, Monitoring, and AI/ML product teams.

  • 6+ years of experience driving complex technical programs across large-scale cloud or network environments (preferably with 2+ years in AI/ML or automation-related programs).
  • Proven experience leading initiatives in cloud infrastructure, networking, or SRE/NRE domains.
  • Demonstrated success managing AI-enabled operations, including predictive analytics, LLM-based knowledge systems, and self-healing automation.
  • Strong understanding of cloud architecture, networking fundamentals (routing, connectivity, monitoring, telemetry), and data pipeline orchestration.
  • Exceptional leadership and stakeholder management skills — able to influence across engineering, product, and operations at all levels.
  • Strategic thinker with strong analytical and problem-solving skills; able to turn ambiguous goals into measurable execution plans.
  • Excellent written and verbal communication skills, with the ability to synthesize complex technical topics for executive audiences.
  • Technical background with the ability to discuss APIs, ML workflows, data architectures, and automation frameworks with engineering teams.
  • Experience working in an AI/Automation-driven operational environment (e.g., AIOps, MLOps, network observability, or autonomous infrastructure) strongly preferred.
  • Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or related technical field.
  • Experience with AI/ML product integration, LLM-based automation, or AI for Operations (AIOps) tools.
  • Familiarity with Terraform, Python, REST APIs, and cloud platforms (OCI, AWS, Azure, GCP).
  • Strong understanding of operational metrics, incident lifecycle management, and continuous improvement processes.