SiteOps Production Operations Engineer

Meta•GA

98d

About The Position

Meta Platforms, Inc. (Meta), formerly known as Facebook Inc., builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps and services like Messenger, Instagram, and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. To apply, click “Apply to Job” online on this web page.

Requirements

Requires a Master’s degree (or foreign equivalent) in Computer Science, Computer Software, Computer Engineering, Telecommunications or related field.
Requires completion of a graduate-level course, research project or internship involving Linux (or equivalent OS) in a complex IT environment.
Ability to triage, debug, and troubleshoot complex, systemic issues.
Knowledge of server hardware and components, including storage.
Understanding of interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, and network.
Experience managing multiple technical issues concurrently while driving each to root cause.
Experience participating in or leading technical projects such as process improvement, technology, or automation.
Familiarity with HTTP, DNS, RAID, and DHCP.
Experience providing technical guidance to external vendors.
Ability to debug, modify, and develop scripts or programs in at least one of these languages: Bash, PHP, Python, SQL, Rust, Go, or Perl.
Knowledge of out-of-band/lights-out server communication methods, including IPMI and serial console.
Experience using data and metrics to drive decisions.

Responsibilities

Support platform health by resolving and closing complex tickets while addressing the underlying issue, including its root cause.
Perform deep dives and root cause analysis of complex technical issues within the data center.
Facilitate collaboration with cross-functional teams on projects and initiatives related to process, hardware, and automation.
Lead the introduction of new platforms and hardware to the site and geographical area.
Use tools and data analysis effectively to identify larger-scope issues impacting one or multiple data centers.
Drive corrective actions for complex hardware issues and influence future design changes.
Solve complex and systemic hardware and/or software issues at scale using scripting, automation, and tooling.
Continuously evaluate and identify areas for improvement in processes, tools, and systems.
Use data analytics to maximize server uptime and utilization rates.
Coach and mentor team members to evaluate and identify better ways to resolve issues.
Provide engineering support and be a go-to technical resource and Subject Matter Expert.
Maintain and update documentation including procedures, runbooks, and guides.
Build cross-functional relationships and influence policies and procedures that improve global data center operations.
Participate in 24/7 on-call rotation.

What This Job Offers

Education Level

Master's degree
