Staff Linux System Administrator

Micron Technology | Boise, ID
Posted 3 days ago

About The Position

Own the administration, reliability, and performance of Linux server environments supporting 24×7 manufacturing systems. Diagnose and resolve complex issues spanning Linux OS, virtualization, containers, networking, and hardware. Design, implement, and test Business Continuity and Disaster Recovery (BC/DR) strategies and runbooks. Drive operational excellence through proactive monitoring of system health, capacity, and performance metrics. Automate operational tasks related to monitoring, alerting, configuration management, and reporting. Serve as a technical leader and resource for critical issues and provide mentoring to L1/L2 teams in a global support model. Collaborate with architects and leadership to implement scalable, resilient infrastructure solutions.

Requirements

  • 5+ years of Linux system administration experience in large enterprise or production environments
  • Advanced expertise in identifying and resolving Linux performance issues, performance tuning, and security hardening
  • Hands‑on experience with Kubernetes platforms (OpenShift or OKD)
  • Demonstrable experience with Ansible and Red Hat Ansible Automation Platform
  • Working knowledge of scripting and automation using Bash/Shell and Python
  • Experience with virtualized or hyper‑converged infrastructure (VMware, Nutanix, or equivalent)
  • Good communication skills and ability to operate independently in a global environment
  • Experience supporting manufacturing or critical production systems
  • Prior participation in on‑call rotations and major incident response
  • Experience contributing to infrastructure standards, procedures, and documentation

Nice To Haves

  • Hands‑on experience with Red Hat Advanced Cluster Management (RHACM) and Red Hat Advanced Cluster Security (RHACS)
  • Familiarity with enterprise monitoring and observability tools (e.g., Splunk)
  • Experience with enterprise backup and recovery platforms (e.g., Cohesity)
  • Experience supporting enterprise server hardware (Dell, Lenovo, or equivalent)

Responsibilities

  • Own the administration, reliability, and performance of Linux server environments supporting 24×7 manufacturing systems
  • Diagnose and resolve complex issues spanning Linux OS, virtualization, containers, networking, and hardware
  • Design, implement, and test Business Continuity and Disaster Recovery (BC/DR) strategies and runbooks
  • Drive operational excellence through proactive monitoring of system health, capacity, and performance metrics
  • Automate operational tasks related to monitoring, alerting, configuration management, and reporting
  • Serve as a technical leader and resource for critical issues and provide mentoring to L1/L2 teams in a global support model
  • Collaborate with architects and leadership to implement scalable, resilient infrastructure solutions