Your Day to Day:
System Reliability & Architecture: Design, build, and maintain highly available and scalable distributed systems. Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to measure and ensure system health.
Release Engineering & CI/CD Pipelines: Architect and optimize end-to-end CI/CD pipelines to ensure rapid, safe, and repeatable software delivery.
Automation & Engineering: Reduce manual, repetitive work by writing code and automation scripts in languages such as Python, PowerShell, or Java to improve efficiency and system reliability.
Advanced Problem Solving: Apply exceptional analytical skills to correlate and troubleshoot complex issues across diverse platforms. Serve as the final point of escalation for high-impact technical challenges, using command-line mastery and interactive shells to resolve deep-system anomalies.
Infrastructure & Platform Engineering: Serve as the technical lead for hybrid infrastructure, managing the full lifecycle of on-premises and cloud-based resources.
Monitoring & Observability: Develop the strategic roadmap for monitoring and observability across the hybrid environment.
Capacity Planning & Performance: Analyze system resource usage to forecast and manage capacity, ensuring systems handle traffic growth.
Mentorship & Leadership: Mentor junior team members, conduct code reviews, and promote SRE best practices across the organization.
Security & Compliance: Maintain a high-security baseline for all platform services, ensuring compliance with SOX, SOC 2, PCI-DSS, or CIS Benchmarks where applicable. Conduct regular security audits, manage encryption protocols, and ensure all infrastructure follows the principle of least privilege.
What We Need From You:
10 years of progressive experience in IT infrastructure engineering, with a proven track record in global enterprise environments.
Deep expertise and hands-on experience in multiple domains: operating systems (Windows and Linux), enterprise storage, backup (Veeam or other solutions), virtualization (Nutanix), hyper-converged systems, networking, and cloud platforms (AWS, GCP).
Strong proficiency in networking fundamentals (TCP/IP, routing, DNS, VPN, SMTP).
Experience in security practices (SSL/TLS, SSH, encryption, LDAP).
Experience implementing configuration management tools (Chef, Ansible).
Design and build experience in orchestration using CloudBolt or Rundeck.
Expertise in building and maintaining code-driven infrastructure using Terraform for provisioning, combined with Python, Shell, and PowerShell for advanced scripting and operational automation.
Proven ability to lead infrastructure projects from design through deployment.
Exceptional problem-solving and strategic thinking skills.
Experience building and managing physical infrastructure in on-prem or hybrid environments.
Experience providing technical guidance to external vendors and partners.
Ability to communicate complex technical concepts to both technical and non-technical audiences.
Excellent skills in communication, documentation, and mentoring others.
Ability to create and update technical documentation, standards, and procedures to support consistency and knowledge sharing.
Industry certifications from Microsoft, Red Hat, Nutanix, VMware, Cisco, or backup and storage vendors, or security-related credentials, are a plus.
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed