DevOps Engineer (Ansible & ELK Stack)

NTT DATA Romania SA · Sibiu, AR

About The Position

We are seeking a Senior DevOps Engineer to design, build, and operate scalable, secure, and highly available infrastructure for our cloud and on-prem environments. The ideal candidate has deep experience with Linux, logging and observability (the ELK Stack), configuration management (Ansible), container orchestration (Kubernetes), infrastructure as code (Terraform), virtualization, and monitoring tools such as Grafana and Prometheus. This role will partner with development teams, platform engineers, and SREs to improve deployment velocity, reliability, and operational excellence.

Requirements

  • Bachelor’s degree in Computer Science or Engineering, or equivalent experience.
  • Minimum 5 years in infrastructure/DevOps or related roles.
  • Deep experience with Linux administration and troubleshooting.
  • Hands-on experience with ELK Stack (deployment, scaling, index management, Kibana dashboards).
  • Solid knowledge of Ansible (playbooks, roles, inventories).
  • Solid scripting skills (Bash, Python, or Go).
  • Experience with virtualization technologies (e.g., VMware, KVM, or Hyper-V).
  • Familiarity with container tooling (Docker, container registries) and orchestration patterns.
  • Monitoring and observability experience with Prometheus (alerting rules, exporters) and Grafana (dashboards, alerts).
  • Experience with CI/CD systems and Git-based workflows.
  • Excellent command of both spoken and written English.

Nice To Haves

  • Infrastructure as code experience using Terraform (modules, state backends, CI integration).
  • Operational experience with Kubernetes.

Responsibilities

  • Build and run configuration management and automation using Ansible.
  • Maintain and improve logging and observability with the ELK Stack (Elasticsearch, Logstash/Beats, Kibana).
  • Operate and optimize Linux servers across cloud and on-prem environments (RHEL/CentOS/Ubuntu).
  • Design and manage virtualization platforms (e.g., VMware, KVM, or similar), including VM lifecycle, templates, and networking.
  • Implement CI/CD pipelines and deployment automation (Jenkins, GitLab CI, GitHub Actions, or similar).
  • Define and implement monitoring, alerting, and SLO/SLI workflows.
  • Implement backup, DR plans, and capacity planning.
  • Troubleshoot production issues, lead post-incident reviews, and drive continuous improvement.