Network and Infrastructure Expert

TOPPAN Security
Addis, LA
Posted 2 days ago

About The Position

Design, build, deploy, and maintain the organization's core IT systems (servers, networks, storage, cloud services) to ensure reliable, efficient, and secure application delivery. The role focuses on automation, performance monitoring, security, and disaster recovery, and involves collaborating with development teams to support business goals. You will translate architectural plans into operational systems, manage infrastructure changes, and provide technical support.

Requirements

  • Degree in Computer Science, Computer Engineering, Software Engineering, or a related field from a recognized institution.
  • Minimum of 3 years of experience managing enterprise-grade infrastructure and network engineering in production environments.
  • Technical Skills: Cloud platforms (AWS, Huawei Cloud Stack), virtualization (VMware, XenServer), networking (L2–L7 stack), scripting (Python, Bash), Linux containerization (Docker, Kubernetes clusters), observability tooling (Prometheus, Grafana, Loki, ELK), Ansible, GitOps.
  • Soft Skills: Problem-solving, communication, collaboration, strategic thinking, rigor.

Responsibilities

  • Design & Implementation: Create and deploy scalable infrastructure (on-prem, cloud, hybrid) by setting up bare-metal servers and virtual machines and configuring switches, firewalls, IP SANs, Kubernetes clusters, and monitoring tools.
  • Maintenance & Support: Manage servers, networks, storage, and Linux and Windows operating systems; troubleshoot issues; and prepare root cause analysis (RCA) reports.
  • Monitoring & Observability: Build and maintain monitoring systems (metrics, logs, traces) to gain deep insight into system health (CPU, RAM, disk usage, I/O stats, network latency, and availability).
  • Automation: Implement Infrastructure as Code (IaC) with tools like Ansible, automate provisioning, and deploy scripts for routine maintenance tasks.
  • Security: Implement security controls, manage firewalls, ensure compliance, and segment the network with proper IP planning, VLANs, and ACLs.
  • Performance & Reliability: Monitor system health and resource utilization, optimize performance, manage backup and disaster recovery strategies, test recovery plans, and proactively identify and resolve site reliability issues. A solid understanding of high availability and fault tolerance is expected.
  • Collaboration: Work with developers and architects to meet project needs.
  • Documentation: Create and maintain technical documentation and architecture diagrams.