This role involves creating new technology components, building software for large-scale infrastructure management, and using AWS services such as S3, SQS, SNS, Step Functions workflows, and Lambda to solve data center problems. The position requires designing and implementing scalable, fault-tolerant architectures, working with databases such as Redshift and DynamoDB, and troubleshooting applications written in Java, Ruby, JavaScript, and Python.

A key aspect of the role is continuous collaboration with teams to identify automation opportunities, design and implement automated workflows, and automate repetitive tasks in SOPs and runbooks. The role also includes implementing monitoring and alerting solutions, analyzing logs and metrics to identify the root causes of problems, and developing runbooks for ticket resolution. Other responsibilities include collaborating with development teams to enhance runbooks, providing incident response support, and performing periodic Change Management executions.

The role also tracks Continuous Deployment implementation, supports development teams, implements constraint and value manager changes, manages cross-organizational campaigns, addresses OS and fleet updates, implements infrastructure best practices, resolves Application Security action items, and participates in major engineering projects with global stakeholders. Additionally, the role involves mentoring System Engineers and interns on Amazon tools and technologies.
Job Type
Full-time
Career Level
Mid Level