Elasticsearch / ELK Stack Engineer - R01564999

Brillio
McLean, VA
$70 - $74
Hybrid

About The Position

We are seeking a highly experienced Senior Observability Engineer with deep expertise in ESS (Elastic Stack) to lead and accelerate the development of enterprise-grade observability capabilities across mission-critical applications. This role requires a hands-on SME who can design, build, and scale observability dashboards, APM, tracing, and monitoring solutions exclusively within ESS. The candidate will play a key role in transforming current monitoring into a proactive, intelligent, and scalable observability ecosystem. This is a high-impact, fast-paced engagement (target < 6 months) requiring ownership, technical depth, and execution excellence.

Requirements

  • Strong hands-on experience with ESS (Elastic Stack): Elasticsearch, Logstash, Kibana, Beats / Elastic Agent, Elastic APM.
  • Proven experience building enterprise-scale observability dashboards in ESS.
  • Deep understanding of: Microservices architecture, Kubernetes / OpenShift (OCP).
  • Experience with: APM, distributed tracing, logging, metrics correlation.
  • Ability to design multi-layer observability (infra → platform → app).

Nice To Haves

  • Experience with: Synthetic monitoring tools integrated with ESS, Real User Monitoring (RUM), Service maps and dependency graphs.
  • Knowledge of: CI/CD observability integration, Alerting frameworks within Elastic.
  • Scripting: Python / Shell / Groovy.

Responsibilities

  • Design and implement end-to-end observability solutions using ESS (Elastic Stack).
  • Build a centralized observability layer covering all MF applications.
  • Ensure block-level aggregation with drill-down to: Application-level metrics, APM traces, Logs and events, Service dependencies.
  • Develop and scale a large backlog of ESS dashboards, including but not limited to: Cluster Health (OCP/K8s), API & APM Dashboards, Service Health & Dependency Monitoring, Pod Status / Restart / Scaling Metrics, HTTP Status Analytics (200/400/500 trends), Transaction Processing Metrics, Infra Metrics (CPU, Memory, Disk, Network), Synthetic Monitoring & Availability.
  • Build intuitive, drill-down dashboards from MF Block → Service → Application level.
  • Expand ESS-based: Application Performance Monitoring (APM), Distributed tracing, Real User Monitoring (RUM), Synthetic monitoring.
  • Enable end-to-end traceability across microservices.
  • Design and implement smart alerting rules.
  • Move from reactive to proactive detection.
  • Reduce alert noise and improve signal quality.
  • Define SLOs, SLIs, and error budgets.
  • Enhance anomaly detection and trend analysis.
  • Work closely with: EOT Observability Team, Internal CDLs, Application teams.
  • Act as ESS Observability SME.
  • Provide guidance, standards, and best practices.

Benefits

  • Great Place to Work® certification