Senior Engineer IT Infrastructure Engineering

IHG
Atlanta, GA
$95,000 - $145,000
Hybrid

About The Position

Your Day to Day:

  • System Reliability & Architecture: Design, build, and maintain highly available and scalable distributed systems. Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to measure and ensure system health.
  • Release Engineering & CI/CD Pipelines: Architect and optimize end-to-end CI/CD pipelines to ensure rapid, safe, and repeatable software delivery.
  • Automation & Engineering: Reduce manual, repetitive work by writing code and automation scripts using languages like Python, PowerShell, or Java to improve efficiency and system reliability.
  • Advanced Problem Solving: Apply exceptional analytical skills to correlate and troubleshoot complex issues across diverse platforms. Serve as the final point of escalation for high-impact technical challenges, utilizing command-line mastery and interactive shells to resolve deep-system anomalies.
  • Infrastructure & Platform Engineering: Serve as the technical lead for hybrid infrastructure, managing the full lifecycle of on-premises and cloud-based resources.
  • Monitoring & Observability: Develop the strategic roadmap for monitoring and observability across the hybrid environment.
  • Capacity Planning & Performance: Analyze system resource usage to forecast and manage capacity, ensuring systems handle traffic growth.
  • Mentorship & Leadership: Mentor junior team members, conduct code reviews, and promote SRE best practices across the organization.
  • Security & Compliance: Maintain a high-security baseline for all platform services, ensuring compliance with SOX, SOC 2, PCI-DSS, or CIS Benchmarks where applicable. Conduct regular security audits, manage encryption protocols, and ensure all infrastructure follows the principle of least privilege.

Requirements

  • 10 years of progressive experience in IT infrastructure engineering, with a proven track record in global enterprise environments.
  • Deep expertise and hands-on experience in multiple domains: operating systems (Windows and Linux), enterprise storage, backup (Veeam or other solutions), virtualization (Nutanix), hyper-converged systems, networking, and cloud platforms (AWS, GCP).
  • Strong proficiency in networking fundamentals (TCP/IP, routing, DNS, VPN, SMTP).
  • Experience in security practices (SSL/TLS, SSH, encryption, LDAP).
  • Experience in implementing configuration management tools (Chef, Ansible).
  • Design and build experience in orchestration using CloudBolt or Rundeck.
  • Expertise in building and maintaining code-driven infrastructure using Terraform for provisioning, combined with Python, Shell, and PowerShell for advanced scripting and operational automation.
  • Proven ability to lead infrastructure projects from design through deployment.
  • Exceptional problem-solving and strategic thinking skills.
  • Experience building and managing physical infrastructure in on-prem or hybrid environments.
  • Experience providing technical guidance to external vendors and partners.
  • Ability to communicate complex technical concepts to both technical and non-technical audiences.
  • Excellent skills in communication, documentation, and mentoring others.
  • Ability to create and update technical documentation, standards, and procedures to support consistency and knowledge sharing.

Nice To Haves

  • Industry certifications from vendors such as Microsoft, Red Hat, Nutanix, VMware, or Cisco, certifications for backup or storage solutions, or security-related credentials are a plus.

Responsibilities

  • Design, build, and maintain highly available and scalable distributed systems.
  • Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to measure and ensure system health.
  • Architect and optimize end-to-end CI/CD pipelines to ensure rapid, safe, and repeatable software delivery.
  • Reduce manual, repetitive work by writing code and automation scripts using languages like Python, PowerShell, or Java to improve efficiency and system reliability.
  • Serve as the final point of escalation for high-impact technical challenges, utilizing command-line mastery and interactive shells to resolve deep-system anomalies.
  • Serve as the technical lead for hybrid infrastructure, managing the full lifecycle of on-premises and cloud-based resources.
  • Develop the strategic roadmap for monitoring and observability across the hybrid environment.
  • Analyze system resource usage to forecast and manage capacity, ensuring systems handle traffic growth.
  • Mentor junior team members, conduct code reviews, and promote SRE best practices across the organization.
  • Maintain a high-security baseline for all platform services, ensuring compliance with SOX, SOC 2, PCI-DSS, or CIS Benchmarks where applicable.
  • Conduct regular security audits, manage encryption protocols, and ensure all infrastructure follows the principle of least privilege.

Benefits

  • We offer a comprehensive benefits package that includes paid time off, medical/dental/vision coverage, 401(k), and other benefits.
  • This role is also eligible for bonus pay.