About The Position

This job is responsible for the reliability, availability, and performance of critical healthcare IT systems, principally in the Environment of Care (EOC), enabling seamless access to essential services for patients, providers, and the people we serve. Proactively identifies and mitigates potential disruptions to maintain the highest standards of care and operational efficiency. This role blends software engineering, clinical engineering, and security principles with a deep understanding of healthcare operations to minimize downtime, improve system resilience, and support clinical workflows and continuity of hospital operations. Works cross-functionally with AHN site leaders and teams to navigate, monitor, and support building automation and facility systems, clinical engineering / IoT, healthcare delivery technology architecture, infrastructure and platform operations, and cybersecurity. Fosters a culture of automation, continuous improvement, collaboration, and patient safety. Develops core metrics (e.g., latency, traffic, errors, and saturation) that SRE practitioners use to monitor and maintain system health, leveraging industry practices, manufacturer guidance, and other service delivery metrics.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, Management Information Systems, IT, or related field, or relevant experience and/or education as determined by the company in lieu of a bachelor’s degree.
  • 3 years of experience in a management or leadership role
  • 5 years of experience with Site Reliability Engineering (SRE), Systems Administration, or DevOps, particularly in healthcare IT
  • 5 years of experience in the medical device management lifecycle, network / device segmentation, and vulnerability and patch management
  • 5 years of healthcare IT experience in architecture, automation, IoT, telemetry, telehealth, security, system development lifecycle, capacity planning, networking, continuous integration / continuous delivery (CI/CD) pipelines, incident management, scripting, metrics, monitoring, redundancy, etc.
  • 3 years of experience working in highly regulated environments
  • 3 years of experience in progressive leadership roles, preferably in a clinical engineering, IT, business continuity, backup and storage management, building automation, or cybersecurity discipline in healthcare
  • Problem-Solving: Excellent analytical and troubleshooting skills; high capacity to think analytically, interpret information and observations, apply judgment, and assist with making effective, strategic decisions.
  • Collaboration: Ability to work effectively in a team environment; demonstrated ability to support multiple sites and locations while maintaining consistency in service delivery processes and procedures.
  • Communication: Strong written and verbal communication skills.
  • Flexibility: Willingness to participate in activities or incidents which may occur outside of regular work schedules.
  • Leadership: Demonstrated resource and project planning capabilities, decision making skills, history of results-oriented delivery, and effective team building across multiple locations and a diverse team of staff, partners, and stakeholders.
  • Security Awareness: Understanding of security best practices and how to apply them in a healthcare IT environment.
  • Delivery and Execution: Demonstrated competency in the execution of multiple projects, including managing resources across multiple projects to meet goals.
  • Relationships: Strong relationship building skills and ability to influence with and without authority in a matrixed organization.

Nice To Haves

  • Master’s degree in Computer Science, Engineering, Management Information Systems, IT, or related field

Responsibilities

  • Performs management responsibilities including, but not limited to, involvement in hiring and termination decisions, coaching and development, rewards and recognition, performance management, and staff productivity.
  • Plans, organizes, staffs, directs, and controls the day-to-day operations of the department; develops and implements policies and programs as necessary; may have budgetary responsibility and authority.
  • Oversees the partnership with clinical engineering, cybersecurity, device manufacturers, suppliers, and Information Technology SMEs to implement strategies for managing, monitoring, and securing a diverse range of clinical devices and other technology equipment (e.g., IoT), ensuring compliance with HIPAA and other relevant regulations (e.g., FDA, TJC, PCI).
  • Keeps current on healthcare IT trends, including AI, security patching, and best practices for device hardening.
  • Oversees and assists with network segmentation and access controls to isolate and protect clinical and other critical devices.
  • Automates monitoring tasks to improve efficiency and reduce errors.
  • Identifies and remediates vulnerabilities in clinical devices and related infrastructure.
  • Manages and reports issues with assets, devices, integration services, and other equipment.
  • Engages the appropriate parties to develop and deploy a fix/solution or oversees ownership of resolution actions.
  • Utilizes observability practices to gain deep insights into system behavior, enabling faster identification and resolution of issues.
  • Oversees the SRE partnership with Clinical Engineering and Cybersecurity Engineering to troubleshoot technical issues related to medical equipment and systems.
  • Participates in the medical device technology lifecycle, from product/device evaluation and discovery through implementation, maintenance, and retirement.
  • Develops the framework and structure to maintain documentation related to the IT infrastructure supporting clinical and other critical devices.
  • Participates in the planning and oversees the execution of preventative maintenance activities.
  • Provides direction and guidance to team members on how to analyze complex problems and develop effective solutions, how to troubleshoot system outages and performance issues, and how to work collaboratively with other IT, cybersecurity, facility, AI, and application teams to resolve issues and conduct root cause analyses.
  • Oversees the SRE partnership with facility leaders to optimize the performance and monitoring of building automation systems (BAS), including HVAC, lighting, fire suppression, security systems, etc.
  • Manages processes and procedures used to monitor BAS performance metrics and proactively identifies potential issues.
  • Works with facilities management to implement improvements to the BAS infrastructure.
  • Works with cybersecurity, vendors/manufacturers, et al. to ensure the security of building automation systems and oversees monitoring of performance, service delivery, and support.
  • Oversees the SRE partnership with IT teams including, but not limited to, platform / product management, disaster recovery services, infrastructure and architecture, storage management, and release management.
  • Participates in the planning and execution of downtime drills and system / device recovery exercises.
  • Supports other emergency preparedness drills and exercises, as needed.
  • Leads or participates in post-incident reviews to identify root causes and implement corrective actions.
  • Works with cross-functional stakeholders to implement and maintain redundant systems and failover mechanisms to minimize downtime.
  • Reviews and provides feedback on emergency operations plans and other materials which are used to respond to emergency situations (e.g., Continuity of Operations Plans, Incident Response Guides, Downtime Procedures).
  • Manages team members who are supporting the planning and execution of system migrations, releases, and upgrades to ensure minimal disruption to clinical operations.
  • Oversees detailed migration or installation plans, including risk assessments, rollback procedures, and communication strategies.
  • Assists local site leaders with navigating shared services (e.g., AI, IT, Information Security, Clinical Engineering, Platform Operations, Technology Acquisition).
  • Establishes core metrics (e.g., latency, traffic, errors, and saturation) that SRE practitioners use to monitor and maintain system health.
  • Manages the processes and procedures used for documentation and knowledge sharing, including maintaining detailed documentation of systems, device inventories, processes, and procedures.
  • Leads by example by sharing knowledge and best practices with other staff and cross-functional teams.
  • Provides training and mentorship to junior or less experienced team members.
  • Stays current with the latest technologies and trends in site reliability engineering.
  • Leads or participates in briefings with cross-functional stakeholders to manage priorities and team assignments, support ticket queues, etc.
  • Other duties as assigned or requested.

Benefits

  • Highmark Health and its affiliates prohibit discrimination against qualified individuals based on their status as protected veterans or individuals with disabilities and prohibit discrimination against all individuals based on any category protected by applicable federal, state, or local law.