About The Position

Oracle Cloud Infrastructure (OCI) is building the future of cloud operations through Incident Response Center as a Service (IRCaaS), a strategic initiative focused on transforming incident management from a highly manual process into an AI-driven, automated platform. Today, thousands of customer-impacting incidents are managed across OCI every year, with a significant portion of engineering effort spent on coordination, communication, workflow execution, and post-incident analysis rather than direct remediation. IRCaaS is designed to automate and orchestrate these activities through intelligent workflows, AI-assisted decision making, and scalable platform services that improve operational efficiency and accelerate incident resolution.

As a Software Engineer 3, you will play a key role in designing and building the platform capabilities that power this transformation. You will work across automation, orchestration, user experience, governance, AI-enabled workflows, and cloud-native services to create a highly scalable system that supports the full incident lifecycle, from detection and declaration through investigation, response, recovery, implementation, and closure.

This role offers a unique opportunity to work on one of OCI's strategic operational transformation initiatives while developing expertise in AI-driven systems, cloud-scale automation, distributed services, and operational excellence. Engineers on this team will help shape the future of autonomous cloud operations and will be well positioned to influence future AI initiatives across the broader organization.

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, or related technical field, or equivalent practical experience.
  • 3-5+ years of professional software engineering experience developing production-quality applications or services.
  • Strong programming skills in Java, Python, Go, C#, or similar object-oriented programming languages.
  • Experience designing, developing, and maintaining distributed systems, microservices, or cloud-native applications.
  • Experience building backend services, APIs, workflow systems, automation platforms, or operational tooling.
  • Knowledge of software engineering fundamentals including data structures, algorithms, testing, debugging, and performance optimization.
  • Experience working with service integrations, event-driven architectures, or asynchronous processing patterns.
  • Familiarity with cloud computing concepts and large-scale production systems.
  • Strong problem-solving, communication, and collaboration skills.
  • Duties and tasks are varied and complex, requiring independent judgment.
  • Fully competent in own area of expertise.
  • May serve as a project lead and/or supervise lower-level personnel.
  • BS or MS degree or equivalent experience relevant to functional area.
  • 4 years of software engineering or related experience.

Nice To Haves

  • Experience with Oracle Cloud Infrastructure (OCI), AWS, Azure, or Google Cloud Platform.
  • Experience building workflow automation, orchestration platforms, or event-driven systems.
  • Experience with observability, monitoring, telemetry, logging, tracing, incident management, or service health platforms.
  • Exposure to AI/ML-enabled applications, LLM integrations, retrieval systems, agent-based workflows, or intelligent automation solutions.
  • Experience integrating operational systems such as ticketing platforms, knowledge bases, collaboration tools, or workflow engines.
  • Experience developing internal platforms, developer tools, or operational automation frameworks.
  • Familiarity with Kubernetes, Docker, Terraform, or infrastructure-as-code technologies.
  • Experience working in highly available, mission-critical, or regulated production environments.
  • Interest in applying AI technologies to improve engineering productivity, operational efficiency, and customer outcomes.

Responsibilities

  • Design, develop, test, deploy, and operate cloud-native software services that support AI-powered operational automation and cloud management workflows.
  • Build intelligent automation, orchestration, and workflow capabilities that reduce manual effort and improve operational efficiency.
  • Develop AI-assisted services that help users gather context, investigate issues, coordinate response activities, summarize operational state, and recommend next actions.
  • Create scalable backend services, APIs, data models, and user-facing experiences that support critical operational workflows.
  • Build integrations across telemetry platforms, observability systems, operational data sources, ticketing systems, collaboration tools, notifications, and workflow engines.
  • Develop event-driven architectures and automation frameworks that improve reliability, consistency, and auditability of operational processes.
  • Collaborate with product managers, engineers, SREs, and operational stakeholders to define requirements and deliver high-impact solutions.
  • Contribute to platform governance, security, compliance, and operational excellence practices.
  • Analyze production issues, identify opportunities for automation, and develop solutions that improve resiliency and customer outcomes.
  • Participate in architecture reviews, design discussions, code reviews, and testing, and follow engineering best practices.
  • Measure and optimize service performance, scalability, reliability, and operational effectiveness through data-driven improvements.
  • Help establish patterns and best practices for AI-enabled automation and autonomous operations that can be leveraged across future OCI initiatives.
  • Contribute to a culture of innovation, ownership, continuous learning, and operational excellence.