Infrastructure Solution Engineer

Allstate • USA - TX (Remote)

Posted 2d ago • $75,100 - $126,325 • Hybrid

About The Position

The Infrastructure Solutions Engineer for NOC Operations is an experienced technical contributor on our Standard Incident service team, providing skilled monitoring, troubleshooting, and resolution of infrastructure incidents across our enterprise technology ecosystem. As an integral contributor to our “Zero Wait” customer obsession initiative, this role delivers rapid and effective response to system alerts, ensuring reliable performance of Allstate’s critical infrastructure. The role calls for solid proficiency in UNIX, backup, and storage environments, along with versatility across Windows, Nutanix, Azure, and AWS platforms. Working within our product-centric operating model, the Infrastructure Solutions Engineer applies technical depth and operational discipline to collaborate with Digital Product Teams (DPTs) and other Service Teams to improve service delivery, reduce friction points, and contribute to automation solutions that enhance system reliability and customer experience.

Requirements

5+ years of experience in technical operations, IT support, or system administration
Solid working knowledge of enterprise systems: Linux (Red Hat), backup (Rubrik), cloud, Windows, and Nutanix
Strong understanding of incident management processes, event management frameworks, and service delivery fundamentals
Strong troubleshooting and analytical skills with the ability to work through complex and unfamiliar technical issues
Effective communication abilities, particularly during high-pressure situations and when coordinating across teams
Experience with ServiceNow reporting or similar ITSM/event management platforms
Working understanding of automation concepts and tools (GitHub, Ansible, Jenkins)
Scripting skills (Bash, Python, PowerShell) with the ability to contribute to automation solutions
Proficiency with monitoring tools such as Datadog and Azure Data Explorer (ADX), including dashboard usage and alert interpretation

Responsibilities

Act as a technical escalation point for incidents across Linux (Red Hat), backup (Rubrik), and storage platforms, supporting and guiding junior analysts
Provide reliable incident response support during critical outages, coordinating with team members and escalating to senior analysts or engineering when appropriate
Mentor and support Service Analyst I team members, offering technical guidance and promoting knowledge development
Build and maintain solid technical expertise across multiple infrastructure domains
Contribute to team knowledge sharing through documentation, SOPs, and participation in training sessions
Proactively monitor and respond to alerts across multiple technology stacks, maintaining strong Mean Time to Acknowledge (MTTA) and Mean Time to Resolve (MTTR) metrics
Apply “Zero Wait” principles in incident response, taking immediate action on Emergency Command Center (ECC) calls without waiting to be prompted
Follow and improve Standard Operating Procedures (SOPs) while identifying opportunities to enhance processes and automate repetitive tasks
Support shift turnover meetings to ensure seamless 24/7 operational coverage and effective knowledge transfer
Actively contribute to the Service Improvement Backlog (SIB) with ideas that meaningfully enhance service delivery and reduce customer friction points
Provide Level 2 support for enterprise infrastructure systems, including Linux (Red Hat) and backup (Rubrik) technologies
Execute incident remediation following documented procedures while exercising sound technical judgment during complex or ambiguous scenarios
Participate in post-incident retrospective reviews and problem management activities, contributing to root cause analysis and prevention of recurring issues
Collaborate with engineering teams to implement and test system changes, providing operational perspective and supporting readiness validation
Utilize monitoring tools including Netcool, Tivoli, Prism Element / Central, Datadog, and Azure Data Explorer (ADX) to identify, diagnose, and resolve system issues
Analyze incident patterns and trends to identify automation opportunities that reduce manual intervention and improve service outcomes
Develop and maintain quality knowledge base articles and SOP documentation, ensuring accuracy and usability for the broader NOC team
Participate in service reviews and demo sessions with technology partners, representing NOC operations effectively
Support the team’s KPI goals through consistent, high-quality service delivery and a continuous improvement mindset
Contribute meaningfully to the NOC automation pipeline by identifying, documenting, and advocating for automation of repetitive tasks

Benefits

Comprehensive technology setup, including a laptop, monitors, headset, keyboard, and mouse.
Monthly connectivity reimbursement for eligible remote employees.
Dedicated, private workspace free from distractions, along with appropriate desk and seating (for remote work).
Reliable internet with minimum speeds of 50 Mbps download and 5 Mbps upload (for remote work).

What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed
