About The Position

Providing for loved ones, planning rewarding retirements, saving enough for whatever lies ahead – our policyholders count on us to be there when it matters most. It’s a big ask, but it’s one that we have the power to deliver when we work together. We collaborate and innovate – pushing one another to transform not just Pacific Life, but the entire industry for the better. Why? Because it’s the right thing to do. Pacific Life is more than a job; it’s a career with purpose. It’s a career where you have the support, balance, and resources to make a positive impact on the future – including your own.

We’re actively seeking a talented Sr Consultant - AI Engineering to join our Site Reliability Engineering team in Newport Beach, CA. This role is hybrid. We believe in empowering our employees to get work done both in and out of the office.

As a Sr Consultant - AI Engineering in SRE, you will operate as a cross-functional technical leader, shaping the engineering blueprint for platforms and solutions, mentoring senior and principal engineers, and embedding reliability-first principles into every phase of the software and platform lifecycle, with a strong focus on Generative AI maturity and augmentation as applied to enhancing software engineering practice and SRE.

Requirements

  • 20+ years of comprehensive engineering experience, including more than 5 years in senior technical leadership roles guiding teams and shaping strategic direction.
  • Demonstrated track record of successfully scaling reliability practices across highly regulated sectors such as insurance, finance, and healthcare, ensuring compliance with industry standards.
  • Advanced expertise in Site Reliability Engineering (SRE), distributed systems architecture, cloud infrastructure management, and ensuring production environments are robust and ready for scale.
  • Hands-on experience implementing and integrating GenAI frameworks to enhance reliability engineering, with practical insights into their deployment and optimization.
  • In-depth knowledge of DevOps methodologies, observability tools, and compliance frameworks including SOC2, ISO 27001, and NIST, ensuring technology solutions meet rigorous regulatory and operational requirements.
  • Outstanding communication, stakeholder engagement, and executive-level presentation skills, enabling effective collaboration across cross-functional teams and with senior leadership.

Responsibilities

  • Reliability Engineering & Production Readiness
    • Design and Scale Reliability Frameworks: Design comprehensive reliability frameworks that support both cloud-native and legacy environments, ensuring seamless integration of best practices across diverse technology stacks. Develop standardized processes and methodologies that enable teams to proactively identify, measure, and mitigate reliability risks at every stage of the software development lifecycle.
    • Collaborate with business stakeholders to establish clear Service Level Agreements (SLAs), Service Level Objectives (SLOs), and error budgets tailored to each platform and service. Align technical reliability metrics with business priorities, enabling data-driven decisions that balance innovation velocity with risk mitigation. Continuously monitor and refine these metrics to ensure they reflect evolving customer expectations and operational realities.
    • Develop and institutionalize robust incident response protocols, ensuring rapid detection, escalation, and resolution of production issues. Foster a blameless postmortem culture that emphasizes learning, transparency, and accountability, driving continuous improvement in incident management processes. Lead cross-functional review sessions to capture actionable insights and implement preventative measures.
    • Advocate for comprehensive observability by implementing advanced monitoring, logging, and tracing solutions that provide deep visibility into platform health and user experience. Drive performance engineering initiatives to proactively optimize system throughput, latency, and resource utilization.
    • Design and enforce fault-tolerant architectures, leveraging redundancy, failover, and self-healing mechanisms to ensure uninterrupted service delivery. Lead the adoption of automation across deployment pipelines, monitoring systems, and recovery operations to minimize manual intervention and reduce operational overhead.
  • Cross-Functional Collaboration and Best Practices
    • Organize workshops, training sessions, and knowledge-sharing forums to disseminate reliability engineering principles and best practices throughout the organization. Mentor engineers at all levels, cultivating a reliability-first mindset and empowering teams to take ownership of platform resilience.
    • Partner closely with software engineering, operations, product management, and business leaders to ensure reliability goals are embedded into every phase of product development and delivery. Facilitate regular sync-ups to review reliability metrics, share insights, and align strategic priorities.
  • GenAI-Augmented SRE Practices
    • Integrate Generative AI into core Site Reliability Engineering (SRE) workflows to elevate operational efficiency, resilience, and scalability across the organization. By embedding advanced AI capabilities, teams can leverage automation and intelligent insights to drive continuous improvement and reduce manual toil. These capabilities include but are not limited to:
        • AI-Assisted Incident Triage and Root Cause Analysis
        • Intelligent Alerting and Noise Reduction
        • Automated Runbook Generation and Remediation Scripting
        • Predictive Reliability Modeling
    • Evaluate and deploy GenAI tools (e.g., LangChain, Semantic Kernel, Copilot) to automate repetitive tasks, streamline troubleshooting, reduce toil, and enhance overall operational efficiency. Conduct regular assessments of tool performance and integration effectiveness to ensure alignment with reliability goals and evolving business requirements.
    • Partner closely with platform and data engineering teams to build robust feedback loops between GenAI models and production telemetry.
  • Strategic Engineering Leadership
    • Guide and support senior- and consultant-level engineers by nurturing a work environment dedicated to resilience, accountability, and the pursuit of engineering excellence. Encourage ongoing learning, proactive problem-solving, and knowledge sharing to strengthen team capabilities.
    • Work in close partnership with infrastructure, product, and compliance teams to ensure that reliability strategies are fully integrated with broader enterprise objectives. Facilitate cross-functional collaboration to proactively address potential challenges and maintain alignment across departments.
    • Direct rigorous technical evaluations for platform enhancements, vendor selection processes, and other strategic initiatives. Oversee due diligence to guarantee that technology upgrades and partnerships meet high standards for reliability, scalability, and compliance.

Benefits

  • Prioritization of your health and well-being including Medical, Dental, Vision, and a Wellbeing Reimbursement Account that can be used on yourself or your eligible dependents
  • Generous paid time off options including Paid Time Off, Holiday Schedules, and Financial Planning Time Off
  • Paid Parental Leave as well as an Adoption Assistance Program
  • Competitive 401k savings plan with company match and an additional contribution regardless of participation

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 1,001-5,000 employees
