About The Position

Our team members are the heart of what makes us better. At Hackensack Meridian Health we help our patients live better, healthier lives, and we help one another to succeed. With a culture rooted in connection and collaboration, our employees are team members. Here, competitive benefits are just the beginning. It’s also about how we support one another and how we show up for our community. Together, we keep getting better, advancing our mission to transform healthcare and serve as a leader of positive change.

The Information Technology (IT) Monitoring & Observability Engineer III partners with Hackensack Meridian Health (HMH) business leaders to design, develop, and maintain automated solutions that enhance monitoring, alerting, and system remediation in a hybrid healthcare IT environment, as part of Digital Technology Services (DTS) within HMH. The role draws on deep expertise in Python, PowerShell, and automation frameworks such as Ansible and Terraform. Its key focus is proactively preventing outages by optimizing monitoring platforms such as Datadog and SolarWinds, managing database configurations, and analyzing system data. The engineer also mentors junior team members and serves as an escalation point for complex technical challenges, ensuring all solutions adhere to security and compliance standards.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field; or 4 years of equivalent related experience.
  • Minimum of 7 years (11 years without a degree) of experience in IT automation and monitoring in a hybrid cloud/on-premises environment.
  • Strong knowledge of monitoring and observability platforms such as Datadog, SolarWinds, Google Cloud Monitoring, Nagios, New Relic, and ELK.
  • Proficiency in Python scripting for automation, data processing, and API integrations.
  • Familiarity with cloud platforms (e.g., Google Cloud, Azure, AWS) and their monitoring services.
  • Strong understanding of IT infrastructure, networking, system logs, and security monitoring principles.
  • Detailed knowledge of database schemas and SQL.
  • Hands-on experience with IT automation tools such as Ansible, Terraform, or PowerShell scripting.
  • Knowledge of containerization and orchestration (e.g., Docker, Kubernetes) is a plus.
  • Fundamental knowledge of secure coding practices.
  • Understanding of observability architecture and design.
  • Strong analytical and troubleshooting skills in a complex IT environment.
  • Ability to work independently and collaboratively in a fast-paced, regulated environment.
  • Excellent communication and collaboration skills to work across IT, security, and business teams.
  • Proficient computer skills that include but are not limited to Google Suite and/or Microsoft Office platforms.
  • Certification in one or more monitoring suites, such as SolarWinds, Datadog, Dynatrace, or Google Cloud Platform (GCP).

Nice To Haves

  • Project Management background.
  • Certification(s) in Cloud Developer and/or Cloud DevOps areas, such as AWS Certified DevOps Engineer, Google Professional Cloud DevOps Engineer, or Microsoft Certified DevOps Engineer.
  • Project Management Professional (PMP) and/or other Project Management Institute (PMI) certifications.

Responsibilities

  • Develop and implement automation workflows for monitoring, alerting, and remediation using Python, PowerShell, Java, APIs, automation tools (e.g., SolarWinds, Datadog, ServiceNow), and any other platforms as appropriate.
  • Automate the systems remediation process to reduce manual interventions and improve response times for infrastructure and application issues.
  • Create and manage intelligent alerting mechanisms across cloud, on-premises, and hybrid environments to enhance system observability.
  • Integrate monitoring solutions with IT service management (ITSM) tools (e.g., ServiceNow) to streamline incident response and resolution.
  • Optimize and maintain monitoring platforms (e.g., SolarWinds, Datadog, Google Cloud Monitoring).
  • Develop custom scripts and application programming interface (API) integrations to extend the capabilities of existing monitoring tools and automate key processes.
  • Design and implement self-healing automation for common IT incidents, ensuring minimal downtime for critical healthcare systems.
  • Configure and manage databases, drawing on advanced knowledge of database schemas and SQL.
  • Collaborate with DTS Operations, Cloud Operations, and Infrastructure teams to enhance monitoring and automation capabilities across hybrid environments.
  • Analyze system performance trends and log data to identify areas for optimization and preventative maintenance.
  • Ensure compliance with healthcare security and regulatory requirements, such as the Health Insurance Portability and Accountability Act (HIPAA) and Health Information Trust Alliance (HITRUST), in monitoring and automation solutions.
  • Serve as the first point of escalation for questions and complex problems from junior team members.
  • Other duties and/or projects as assigned.
  • Adhere to HMH organizational competencies and standards of behavior.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Number of Employees: 5,001-10,000 employees
