About The Position

Our client's Enterprise Observability team is looking for a senior-level ELK Stack Subject Matter Expert (SME). The team is responsible for enterprise infrastructure, application, and network observability, with a primary focus on log management and metrics. The selected candidate will join a team of skilled engineers with broad backgrounds in enterprise observability. The ELK Stack Engineer role focuses on maintaining the reliability, scalability, and availability of our enterprise Elastic Stack platform, which is used for log management, metrics, and observability. The role relies heavily on automation with tools like Terraform and Ansible, and requires the candidate to maintain performance KPIs and define SLOs for the platform.

Requirements

  • BS/MS in CS/Engineering or equivalent, OR 5+ years of experience.
  • 4+ years of experience working directly with the ELK Stack as either an Admin, SME, or Architect.
  • Hands-on experience designing data pipelines using Logstash and/or Fluentd/Fluent Bit.
  • Expert-level knowledge of the ELK Stack (on-prem and cloud), including best practices related to performance, security, and component setup (Elasticsearch, Logstash, Kibana, Beats).
  • Fluent in scripting languages such as Python and either Bash or PowerShell to automate tasks.
  • Experience with Terraform and Ansible, including syntax, best practices, and managing complex configurations to build and manage infrastructure and applications.
  • Strong working knowledge of the Linux operating system.
  • Highly self-motivated and directed.
  • Good analytical and problem-solving/troubleshooting abilities.

Responsibilities

  • Deploy and maintain monitoring and alerting systems within the ELK Stack.
  • Design, configure, and maintain our large-scale log aggregation solution using Elasticsearch and Logstash.
  • Set up and manage data ingestion pipelines and transformations using tools like Filebeat, Logstash, and/or Fluentd/Fluent Bit.
  • Embrace the mindset of 'automate any task' to improve efficiency.
  • Build and maintain robust monitoring systems using Elasticsearch, Kibana, and Beats to proactively detect potential issues and trigger timely alerts.
  • Maintain associated documentation as it applies to our audit and certification requirements.
  • Participate in troubleshooting, capacity planning, and performance analysis activities related to the ELK Stack.
  • Research new observability requirements and, in many cases, write code to implement them.
  • Set up monitoring policies, rules, and templates, and write scripts to meet observability requirements.

What This Job Offers

Career Level

Senior

Education Level

Bachelor's degree

Number of Employees

11-50 employees
