Production Reliability Engineer II

Early Warning Services
Scottsdale, AZ
Hybrid

About The Position

This position is responsible for the stability, performance, and growth of key business platforms. The role involves close collaboration with various teams, including Production Reliability Engineers, Product Owners, Architecture, Security, and Engineering, to ensure reliability, scalability, and performance requirements are met. The engineer will manage and support code deployments, hardware upgrades, patching, and certificate renewals across different environments (CAT, DR, PRD), perform various types of testing, and maintain documentation such as topology and business workflow diagrams. A key aspect of the role is to identify and implement automation opportunities to improve processes, reduce risk, and enhance user experience. The position also requires supporting risk management, data integrity, and confidentiality, including managing Self-Identified-Issues (SIIs). For Telco/Transmissions-focused roles, responsibilities include managing client implementations related to external connectivity, creating network requests, and supporting file transmission software.

Requirements

  • Bachelor’s degree in Computer Science, Information Systems, or other related fields.
  • 2 to 5 or more years of related experience.
  • Knowledge of Information Technology Infrastructure Library (ITIL) and Information Technology Service Management (ITSM) disciplines, practices, and procedures.
  • Familiarity with Agile Methodology, development, and project management requirements focused on system performance and reliability.
  • Ability to relate business needs to system capabilities and fully understand the role of the systems and impacts on the business.
  • Advanced knowledge of platform architecture systems and requirements.
  • Strong attention to detail and accuracy, policy compliance, and documentation.
  • Ability to manage multiple priorities and perform strategic decision-making.
  • Ability to analyze problems and review multiple alternate solutions, including analysis of advantages, potential risks, and remediations.
  • Effective written and verbal communication with all levels of internal teams and/or external customers.
  • Background check and drug screen.
  • For Telco/Transmissions-focused roles, education and experience typically obtained through one of the following: completion of a Bachelor’s degree in Computer Science or Information Systems, with experience supporting users and an ability to relate business needs to system capability; OR completion of various technology certifications in systems and networking; OR computer operations experience with knowledge of networking and distributed systems, coupled with the ability to apply this toward business operations; OR experience with solid technology capabilities and a comfort level with applications and batch jobs.
  • Knowledge of telecom/network functions, transmission protocols such as TCP/IP, SFTP, or Connect:Direct, and Cisco or Juniper networks (for Telco/Transmissions-focused roles).

Nice To Haves

  • Knowledge and/or experience with customer implementations is a plus.
  • Amazon Cloud certification.

Responsibilities

  • Work closely with Production Reliability Engineers, Product Owners, Architecture, Security, Engineering, and other teams to collaborate on requirements and priorities.
  • Provide input and guidance to platform projects to ensure reliability, scalability, current functionality, capacity, and performance requirements are not adversely impacted.
  • Document, manage, and support code deployments, hardware upgrades, patching, and certificate renewals into the Customer Acceptance Testing (CAT), Disaster Recovery (DR), and Production (PRD) environments.
  • Perform functional, regression, performance, and other assisted testing.
  • Work closely with Product and Customer Enablement Teams to maintain up-to-date topology diagrams and business workflow diagrams for supported platforms.
  • Review, provide clarification, update, and manage Business Review Documents (BRD) and Production Readiness Checklists (PRC).
  • Provide status report updates as requested by Project Managers and company leaders.
  • Prioritize, manage, and keep current on tasks, stories, and assignments within the team Kanban board.
  • Identify, create, and publish customer communications regarding maintenance windows, code deployments, Disaster Recovery Exercises, or any other potential customer-impacting events.
  • Identify areas where efficiencies and/or automation could improve processes, reduce manual processes, reduce risk, and provide a better user experience.
  • Work with others to architect solutions, create epics or stories, prioritize, test, implement, and measure improvements.
  • Support the company commitment to risk management and protecting the integrity and confidentiality of systems and data, including managing Self-Identified-Issues (SIIs).
  • Provide support on testing, timelines, and requirements for bank on-boarding.
  • Review changes in User Acceptance Testing (UAT) and participate in discussions for scheduling deployments.
  • Assist in the triage and support of incident or request tickets and their corresponding Service Level Agreements (SLAs).
  • Identify, track, manage, and report on Service Level Agreements (SLAs).
  • Assist in the creation, standardization, and use of Status Cast communications for customer communication on incidents and/or deployments.
  • Manage client implementations related to external connectivity needs such as telecom, SSL certs, networking, and transmissions (for Telco/Transmissions-focused position).
  • Create necessary network connectivity requests such as telco setups and firewall requests (for Telco/Transmissions-focused position).
  • Follow transmission templates for automated file transfers between clients and production systems (for Telco/Transmissions-focused position).
  • Support administration of File Transmission software applications, configuration files, and various system components (for Telco/Transmissions-focused position).
  • Work with other Early Warning departments to perform ad hoc file transfers, or automate ongoing ones, between various production systems (for Telco/Transmissions-focused position).
  • Comply with all security policies and procedures to ensure the highest level of integrity relative to data protection, confidential and sensitive information, and security within the workplace.

Benefits

  • Competitive medical (PPO/HDHP), dental, and vision plans.
  • Company contributions to your Health Savings Account (HSA) or pre-tax savings through flexible spending accounts (FSA) for commuting, health & dependent care expenses.
  • 401(k) Retirement Plan with a 100% Company Safe Harbor Match on your first 6% deferral immediately upon eligibility.
  • Flexible Time Off for Exempt (salaried) employees, as well as generous PTO for Non-Exempt (hourly) employees.
  • 11 paid company holidays.
  • A paid volunteer day.
  • 12 weeks of Paid Parental Leave.
  • Maven Family Planning support (egg freezing, fertility, adoption, surrogacy, pregnancy, postpartum, early pediatrics, and returning to work).