Dynatrace Observability Engineer

MLabs
McLean, VA
1d
Onsite

About The Position

Our firm has been retained to find a highly skilled Dynatrace Observability Engineer to design, implement, and optimize a world-class enterprise observability ecosystem. In this role, you will be the technical lead for our client's visibility strategy, leveraging Dynatrace’s APM, Infrastructure Monitoring, RUM, and Davis AI capabilities to ensure system reliability and proactive incident detection across hybrid and multi-cloud environments. You will collaborate directly with SREs, DevOps, Cloud Engineering, and business stakeholders to build scalable solutions that empower faster troubleshooting and data-driven decision-making.

Requirements

  • Tenure: Minimum 10 years of hands-on experience with Dynatrace APM (SaaS or Managed environment).
  • Observability Core: Strong understanding of the four pillars: metrics, logs, traces, and events.
  • Infrastructure: Deep knowledge of cloud platforms (AWS, Azure, or GCP) and/or Kubernetes.
  • Configuration: Proven experience configuring dashboards, alerts, tagging rules, and management zones.
  • Architecture: Solid understanding of microservices, distributed systems, APIs, and application performance concepts.
  • Automation: Proficiency with scripting languages (Python, Bash, PowerShell) and YAML-based configuration.
  • DevOps: Familiarity with CI/CD tools (Jenkins, Azure DevOps, GitHub Actions, or GitLab).

Nice To Haves

  • Dynatrace Certifications (Associate, Professional, or Master).
  • Experience with Monaco (Monitoring-as-Code) or Terraform.
  • Familiarity with OpenTelemetry or Chaos Engineering.
  • Background integrating Dynatrace with tools like ServiceNow, PagerDuty, or Grafana.

Responsibilities

  • Platform Ownership: Administer and maintain the Dynatrace platform, including upgrades, governance, and the deployment of OneAgents/ActiveGates across cloud, on-premise, and Kubernetes/OpenShift environments.
  • Observability Engineering: Implement end-to-end monitoring for applications, APIs, and logs. Configure distributed tracing, service flow mapping, and synthetic monitoring.
  • Performance & Reliability: Utilize Davis AI insights to analyze latency and bottlenecks. Partner with app teams for root-cause remediation and define SLOs/SLIs aligned with business KPIs.
  • Automation & Integration: Build automation scripts (Python, Bash, PowerShell) to streamline deployments. Integrate Dynatrace with CI/CD pipelines, ITSM tools (ServiceNow/PagerDuty), and logging platforms like Splunk.
  • Incident Management: Act as the Subject Matter Expert (SME) during major incidents, providing deep-dive observability insights and post-incident reviews.

Benefits

  • Technology Leadership: Serve as the primary SME for observability, influencing the roadmap for high-scale cloud environments.
  • High-Impact Work: Directly improve customer experience and system uptime for a major enterprise platform.
  • Professional Environment: Join a collaborative culture that values continuous improvement and technical ownership.