As a Senior Network Operations Engineer at Together AI, you are our front-line responder for break/fix incidents: you own alert triage, collaborate with SRE and MLOps teams, and drive rapid resolution to keep our global network and platform running smoothly. You combine strong operational discipline with hands-on troubleshooting and a bias for automation. Beyond traditional networking, you'll work directly with Kubernetes and Slurm to diagnose issues that span infrastructure, container networking, and HPC job fabrics. You're fluent in routing/switching and network security fundamentals and comfortable on Linux, and you thrive in fast-moving environments where clear communication and crisp execution matter. You'll improve monitoring, runbooks, and recovery playbooks to reduce MTTA/MTTR and prevent repeat incidents. Outstanding problem-solving abilities and a solid understanding of fundamental network theory are also critical to your success.