Responsibilities
Monitor datacenter infrastructure health, capacity, and performance by proactively identifying risks, inefficiencies, or failure points and responding to issues as needed.
Work with our datacenter suppliers and vendors to procure hardware and services for replacement, expansion, and lifecycle management.
Perform day-to-day administrative, management, and configuration tasks of datacenter infrastructure, including servers, networking, power, and supporting systems.
Engage with internal staff and support to understand platform use and shortcomings.
Analyze infrastructure and platform usage to define, document, and promote best practices, standards, and reusable templates.
Contribute to discussions and decisions on the directions we are taking and ongoing projects/tasks.
Create and maintain technical documentation, runbooks, and operational processes to support scalability and knowledge sharing.
Identify opportunities to automate operational tasks using scripting and tooling to reduce manual effort and error.
Continuously improve monitoring coverage and alert quality to reduce noise and improve signal.
Support vulnerability remediation and patching efforts across infrastructure components.
Standardize operational procedures to improve consistency and reduce operational risk.
Serve as a technical point of escalation for resolving issues affecting our platform, driving issues to resolution.
Develop, maintain, and enhance monitoring, alerting, and observability solutions to improve operational awareness and response times.
Coordinate routine, scheduled, and emergency maintenance activities to ensure maximum uptime and minimal service disruption.
Collaborate with internal engineering, operations, and support teams to understand platform usage, constraints, and improvement opportunities.
Travel to datacenter locations, both domestic and international, to support projects, deployments, audits, and vendor engagements.
Support and promote the company values through positive interactions with both internal and external partners and customers on a regular basis.
Perform additional responsibilities as assigned to support the overall health, stability, and growth of the platform.
Job Type
Full-time
Career Level
Mid Level