Infrastructure Engineer

Cerebras Systems · Sunnyvale, CA
16d · Onsite

About The Position

We are looking for a hands-on Infrastructure Engineer to join our team and support our high-performance, on-premise server and networking infrastructure. You will be responsible for maintaining, provisioning, and troubleshooting hardware and Linux systems, working closely with network and system teams. This is an in-person role, ideal for someone who enjoys working across hardware, networking, and system layers.

Requirements

  • 3–5+ years of experience in data center, lab, or infrastructure engineering roles.
  • Proficient in Linux system administration and network configuration.
  • Strong hands-on knowledge of x86 server hardware and enterprise networking.
  • Familiar with BIOS configuration, firmware updates, and remote management tools.
  • Skilled in physical setup and troubleshooting of high-speed NICs and optical links.
  • Experience with VLANs, static routing, and diagnosing layer 1–3 issues.
  • Ability to write scripts for automation and diagnostics (Bash, Python preferred).
  • Comfortable working on-site daily and lifting/moving server hardware.

Nice To Haves

  • Experience with PXE, NFS, RAID controllers, and monitoring tools.
  • Familiarity with configuration management tools (e.g., Ansible).
  • Prior experience in a lab or R&D hardware/software environment.

Responsibilities

  • Physically install, rack, cable, and maintain blade servers and hardware components (CPUs, DIMMs, NICs, storage devices, etc.).
  • Connect servers to high-speed networks (100G/400G), verify optics/DACs, and check link status.
  • Configure BIOS, firmware, and out-of-band management (IPMI/iDRAC/iLO).
  • Install and provision Linux OS; configure hostnames, IPs, routing, and NFS mount points.
  • Debug network issues at physical and OS level (VLAN, link issues, routing, etc.).
  • Use Linux tools (e.g., ip, dmesg, netstat, ping) to isolate and fix issues.
  • Follow provisioning playbooks and maintain accurate records of assets and changes.
  • Use scripting (Bash, Python) to automate routine tasks and improve efficiency.
  • Collaborate with internal teams (network, systems, storage) and coordinate vendor RMAs.
  • Document procedures and contribute to the team knowledge base.
  • Troubleshoot and replace failed server components with minimal downtime.

Benefits

  • Build a breakthrough AI platform beyond the constraints of the GPU.
  • Publish and open source your cutting-edge AI research.
  • Work on one of the fastest AI supercomputers in the world.
  • Enjoy job stability with startup vitality.
  • Enjoy a simple, non-corporate work culture that respects individual beliefs.