Production Reliability Engineer II

Early Warning®
Scottsdale, AZ
$66,000 - $82,000
Hybrid

About The Position

This position is responsible for the stability, performance, and growth of key business platforms. The role involves close collaboration with various teams to ensure reliability, scalability, and performance requirements are met. It includes managing deployments, upgrades, patching, and certificate renewals across different environments (CAT, DR, PRD). The engineer will also be responsible for documentation, communication of impacting events, and identifying opportunities for automation and process improvement. A key aspect of the role is supporting the company's commitment to risk management and data confidentiality. For a Telco/Transmissions-focused position, responsibilities extend to managing client implementations related to external connectivity, creating network requests, and supporting file transmission software.

Requirements

  • Bachelor’s degree in Computer Science, Information Systems, or other related fields.
  • 2 - 5 years or more of related experience.
  • Knowledge of Information Technology Infrastructure Library (ITIL) and Information Technology Service Management (ITSM) disciplines, practices, and procedures.
  • Familiarity with Agile Methodology, development and project management requirements focused on system performance and reliability.
  • Ability to relate business needs to system capabilities and fully understand the role of the systems and impacts on the business.
  • Advanced knowledge of platform architecture systems and requirements.
  • Strong attention to detail and accuracy, policy compliance and documentation.
  • Ability to manage multiple priorities and perform strategic decision-making.
  • Ability to analyze problems and review multiple alternate solutions including analysis of advantages, potential risks and remediations.
  • Effective written and verbal communication with all levels of internal teams and/or external customers.
  • Successful completion of a background check and drug screen.
  • Education and experience typically obtained through completion of a Bachelor’s Degree in Computer Science or Information Systems, with experience supporting users and an ability to relate business needs to system capability, OR education and experience typically obtained through completion of various technology certifications in systems and networking.
  • Computer operations experience with knowledge of networking and distributed systems coupled with the ability to apply this toward business operations OR experience with solid technology capabilities and a comfort level with applications and batch jobs.
  • Knowledge of telecom/network functions and transmission protocols such as TCP/IP, SFTP, or Connect:Direct, and of Cisco or Juniper networks.

Nice To Haves

  • Knowledge and/or experience with customer implementations is a plus.
  • Amazon Cloud certification.

Responsibilities

  • Work closely with Production Reliability Engineers, Product Owners, Architecture, Security, Engineering, and other teams to collaborate on requirements, priorities, etc.
  • Provide input and guidance to platform projects to ensure reliability, scalability, current functionality, capacity, and performance requirements are not adversely impacted.
  • Document, manage, and support code deployments, hardware upgrades, patching, and certificate renewals into the Customer Acceptance Testing (CAT), Disaster Recovery (DR), and Production (PRD) environments.
  • Perform the appropriate functional, regression, performance and other assisted testing.
  • Work closely with Product and Customer Enablement Team to maintain up-to-date topology diagrams and business workflow diagrams for supported platforms.
  • Review, provide clarification, update, and manage Business Review Documents (BRD) and Production Readiness Checklists (PRC).
  • Provide status report updates as requested by Project Managers and company leaders.
  • Prioritize, manage, and keep current on tasks, stories, and assignments within team Kanban board.
  • Identify, create, and publish customer communications regarding maintenance windows, code deployments, Disaster Recovery Exercises or any other potential customer impacting events within Customer Acceptance Testing (CAT), Disaster Recovery (DR) and Production (PRD) environments.
  • Identify areas where efficiencies and/or automation could improve processes, reduce or eliminate manual processes, reduce risk, and provide a better user experience.
  • Work with others to architect the solution, create the epics or stories required, prioritize, test, implement and measure the improvements.
  • Support the company's commitment to risk management and to protecting the integrity and confidentiality of our systems and data.
  • Provide support on testing, timelines, and requirements for bank on-boarding.
  • Review changes in User Acceptance Testing (UAT) and participate in discussions to schedule which changes will be deployed and into which environment.
  • Assist in triage and support of incident or request tickets and their corresponding Service Level Agreements (SLAs).
  • Identify, track, manage, and report on Service Level Agreements (SLAs).
  • Assist in creation, standardization and use of Status Cast communications as needed for customer communication on incidents and/or deployments.
  • Manage client implementations related to external connectivity needs such as telecom, SSL certs, networking, and transmissions.
  • Create all necessary network connectivity requests such as telco setups, firewall requests, etc.
  • Follow transmission templates for automated file transfers between clients and production systems.
  • Support administration of File Transmission software applications, configuration files and various system components that support the movement and management of file deliveries.
  • Work with all other Early Warning departments to perform ad hoc, or to automate ongoing, file transfers between various production systems.
  • Comply with all security policies and procedures to ensure the highest level of integrity relative to data protection, confidential and sensitive information, and security within the workplace.

Benefits

  • Competitive medical (PPO/HDHP), dental, and vision plans
  • Company contributions to your Health Savings Account (HSA) or pre-tax savings through flexible spending accounts (FSA) for commuting, health & dependent care expenses.
  • 401(k) Retirement Plan – Featuring a 100% Company Safe Harbor Match on your first 6% deferral immediately upon eligibility.
  • Flexible Time Off for Exempt (salaried) employees, as well as generous PTO for Non-Exempt (hourly) employees.
  • 11 paid company holidays
  • Paid volunteer day
  • 12 weeks of Paid Parental Leave
  • Maven Family Planning – provides support through your Parenting journey including egg freezing, fertility, adoption, surrogacy, pregnancy, postpartum, early pediatrics, and returning to work.