The Core Infrastructure Team is the backbone of Eightfold, responsible for the architecture, maintenance, and enhancement of critical elements of our technology stack. This includes the infrastructure supporting our Search, Databases, Machine Learning, Data Warehouse, Developer Platform, and Application Infrastructure in AWS and Azure. The team ensures global reliability, scalability, security, and cost optimization, and drives the adoption of Infrastructure as Code (IaC).

We are looking for a Software Engineer/Site Reliability Engineer (SRE) with a strong focus on production infrastructure health and operations to join our Core Infra team. You will be responsible for the stability, uptime, incident response, and operational excellence of Eightfold's multi-region production environments across AWS and Azure. This role emphasizes system health rather than feature development velocity, and you will play a critical role in ensuring the reliability of the core infrastructure that underpins all customer-facing solutions.

We've built strong foundations with automation, monitoring, and runbooks to ensure on-call is predictable and manageable. This role offers engineers valuable hands-on production support experience, as well as the opportunity to improve overall infrastructure health through observability, health metrics, and reliability engineering practices.