Site Reliability Technical Lead

Jabil, Florence, KY

About The Position

At Jabil we strive to make ANYTHING POSSIBLE and EVERYTHING BETTER. We are proud to be a trusted partner for the world's top brands, offering comprehensive engineering, manufacturing, and supply chain solutions. With over 50 years of experience across industries and a vast network of over 100 sites worldwide, Jabil combines global reach with local expertise to deliver both scalable and customized solutions. Our commitment extends beyond business success as we strive to build sustainable processes that minimize environmental impact and foster vibrant and diverse communities around the globe.

Jabil is a product solutions company providing comprehensive design, manufacturing, supply chain, and product management services. Operating from over 100 facilities in 29 countries, Jabil delivers innovative, integrated, and tailored solutions to customers across a broad range of industries and end-markets, such as automotive, consumer lifestyle and wearable tech, defense and aerospace, connected home and building, industrial and energy, enterprise and infrastructure, healthcare, mobility, packaging and printing.

How will you make an impact?

As a Site Reliability Technical Lead within Jabil’s Cloud Test Software Development team, you will directly contribute to the daily operations and ongoing development of our Cloud Test Platform Infrastructure deployed across multiple global production facilities. You will provide a first-line response to production issues, including outages, end-user performance concerns, change management, and monitoring, while driving continuous improvement of production test infrastructure and applications. You will also ensure site software and hardware remain current to maintain high availability, and lead cross-training initiatives to strengthen team capability, knowledge sharing, and operational resilience.

Requirements

  • Experience in the following programming/scripting languages: Python, Java, Bash. C and C++ experience is a plus.
  • Understanding of Linux fundamentals (Ubuntu).
  • Familiarity with hardware and API solutions for controlling, managing, and stressing L10 devices (servers, network, and storage such as SSDs and NVMe): IPMI, Redfish, mprime, FIO, Linpack, ptugen, memtester.
  • Experience with leading-edge networking systems, hardware, software, and protocols, including but not limited to enterprise Ethernet datacenter switching/routing at L1, L2, and L3 (BGP, DHCP Relay, ECMP). Arista CloudVision is a plus.
  • Demonstrated ability to lead communication of manufacturing test infrastructure improvements and deliver consolidated, executive-level reporting on site performance, uptime risks, reliability trends, and mitigation strategies.
  • Bachelor’s degree in Electrical Engineering, Computer Engineering, Computer Science, or a related field.
  • 1-3 years of software engineering and/or IT operations and infrastructure experience.
  • Excellent verbal and written communication skills.
  • Experience working in multi-site and multi-cultural environments.
  • Domestic and/or international travel, up to 10%, may be needed.

Nice To Haves

  • Familiarity with creating and configuring virtual machines (VMs) using VMware, including DHCP, PXE boot, and nginx setup.

Responsibilities

  • Provide sustaining support and maintenance for the manufacturing server (L10) and rack (L11-L12) level test software and infrastructure deployed at our production facilities.
  • Support the site’s current manufacturing server (L10) and rack (L11-L12) test infrastructure, as well as the planning, deployment, and assembly of future expansions.
  • Maintain documentation of installations, upgrades, and management for the manufacturing server (L10) and rack (L11-L12) test infrastructure.
  • Communicate manufacturing test infrastructure improvements and provide leadership with consolidated reporting on site performance, uptime risks, and reliability trends.
  • Support manufacturing test incident response, analysis, and corrective actions for the site operations.
  • Participate in closed-loop analysis of, and responses to, factory test failures.
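  • Perform scheduled preventive maintenance on the test infrastructure, including the main distribution frame (MDF), intermediate distribution frames (IDF), and system-under-test (SUT) top-of-rack (TOR) switches.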
  • Administer, upgrade, maintain, and continuously improve the enterprise monitoring platform to ensure proactive alerting, system visibility, and high availability of factory IT/OT and production-critical systems.

Benefits

  • Medical, Dental, Prescription Drug, and Vision Insurance with HRA and HSA options
  • 401(k) Match
  • Employee Stock Purchase Plan
  • Paid Time Off
  • Tuition Reimbursement
  • Life, AD&D, and Disability Insurance
  • Commuter Benefits
  • Employee Assistance Program
  • Pet Insurance
  • Adoption Assistance
  • Annual Merit Increases
  • Community Volunteer Opportunities