Reliability Engineer (Remote)

Inspira Financial
Oak Brook, IL (Remote)

About The Position

The Reliability Engineer (RE) reports to the Reliability Engineering Manager in the Technology Department. The RE works closely with engineering, security, and infrastructure teams to ensure Inspira’s systems are highly available, scalable, and secure. The RE plays a crucial role in deployments, incident response, system reliability, and performance optimization, while also contributing to long-term infrastructure and reliability strategies. Working within a team environment, the RE participates directly in IT reliability solution creation, providing hands-on and operational support as well as training. This individual must be creative, client-focused, solutions-driven, and organized, and must be able to thrive in a dynamic environment.

Requirements

  • Minimum 3 years of experience in Information Technology
  • 3+ years of role-specific experience
  • Minimum of 3 years of experience with IaC tools such as Terraform, Bash scripting, etc.
  • Experience supporting Containerization Platforms such as K8s and Docker
  • Experience working with Automation tools such as ADO, Jenkins, and Chef
  • Experience working with Observability tools such as Datadog and Azure Monitor
  • Knowledge of principles such as SLIs, SLOs, and error budgets.
  • Familiarity with observability concepts beyond monitoring, such as distributed tracing and log correlation.
  • Knowledge of Virtual Machines and Container concepts.
  • Knowledge of Security as it relates to Cloud Environments including the Shared Security Model
  • Experience with scripting languages such as PowerShell, Bash, Python, etc.
  • Experience with cloud services: Azure (preferred), Google Cloud, or AWS
  • Experience with BDR solutions such as Veeam, VMware Site Recovery, and Azure Backup/Site Recovery
  • Ability to work independently with minimal supervision
  • Must have excellent written and verbal communication skills
  • Strong analytical skills, follow-up capability, and problem-solving ability
  • Ability to conduct research into hardware and software issues and products as required
  • Ability to effectively prioritize and execute tasks in a high-pressure environment
  • Ability to use strong interpersonal and presentation skills to share ideas and solutions and to build strong working relationships with business units, including non-technical users, technical leads, and developers
  • Experience working with a ticketing system and internal clients
  • Ability to respond to emails and text messages after hours to resolve critical issues
  • Must possess strong skills in personal diplomacy and client service while consistently demonstrating a high level of motivation, commitment to teamwork, professionalism and trustworthiness
  • Strong vendor management skills
  • Highly self-motivated and directed
  • Ability to provide personal transportation from time to time.
  • Ability to work overtime.
  • Prolonged periods of sitting at a desk and working on a computer

Nice To Haves

  • Certifications preferred: AZ-900, Datadog Fundamentals
  • Experience in a high availability environment preferred
  • Knowledge of ITIL/ITSM practices and framework preferred

Responsibilities

  • Partner with the Engineering and Security teams to create, implement and apply SRE principles, processes, and controls.
  • Build and support the Site Reliability function and participate in building tools to monitor and report system KPIs.
  • Monitor the platform and environment with tools such as Datadog, Azure Monitor, etc.
  • Configure and support the Disaster Recovery and Business Resumption Plan as it relates to the backup and restoration of the technology infrastructure. Ensure runbooks are updated on a regular basis.
  • Utilize programming skills to design and develop programs or scripts for various repetitive functions
  • Contribute to long-term infrastructure strategies and reliability improvements.
  • Perform all duties with a focus on the goals of Inspira, which include risk mitigation
  • Support inbound calls/emails, maintaining tickets within the issue tracking application related to Infrastructure Support
  • Cross-train other team members to facilitate coverage
  • Other duties as assigned