Comcast • Posted 8 months ago
Full-time • Mid Level
Onsite • Reston, VA
Broadcasting and Content Providers

Make your mark at Comcast -- a Fortune 30 global media and technology company. From the connectivity and platforms we provide, to the content and experiences we create, we reach hundreds of millions of customers, viewers, and guests worldwide. Become part of our award-winning technology team that turns big ideas into cutting-edge products, platforms, and solutions that our customers love. We create space to innovate, and we recognize, reward, and invest in your ideas, while ensuring you can proudly bring your authentic self to the workplace. Join us. You'll do the best work of your career right here at Comcast.

(In most cases, Comcast prefers to have employees on-site collaborating unless the team has been designated as virtual due to the nature of their work. If a position is listed with both office locations and virtual offerings, Comcast may be willing to consider candidates who live greater than 100 miles from the office for the remote option.)

Job Summary
Comcast's Observability Program provides critical platform services across the company, ensuring teams can monitor, analyze, and optimize application performance. As an Engineer 3, you will play a key role in designing, developing, and maintaining observability platforms, including MetriX, OCP, ADE, Tracing, Grafana, and Elasticsearch. This position requires a high level of technical expertise, problem-solving skills, and experience working in large-scale, high-traffic distributed systems. You will collaborate with teams across Comcast to enhance logging, metrics, tracing, visualization, and alerting capabilities while leading efforts to improve automation, scalability, and reliability.

Responsibilities
  • Design, develop, and maintain observability platforms using Python and Golang.
  • Build and enhance React-based frontends for observability dashboards and self-service tools.
  • Implement RESTful APIs and microservices.
  • Develop and manage infrastructure using Kubernetes and Docker.
  • Automate observability tooling and deployment processes using Helm, Terraform, and Ansible.
  • Architect scalable and resilient observability solutions for logging, metrics, and tracing.
  • Optimize and scale Elasticsearch, Prometheus/VictoriaMetrics, and Grafana for high-volume data ingest and queries.
  • Integrate OpenTelemetry for distributed tracing and RED metric generation.
  • Improve real-time anomaly detection pipelines with ADE (Anomaly Detection Exporter).
  • Provide technical leadership in incident response, troubleshooting, and resolution.
  • Collaborate with internal teams to improve observability best practices and operational insights.
  • Participate in an on-call rotation for platform support, including after-hours and weekends.
  • Enhance CI/CD automation to streamline deployments and reduce manual intervention.
  • Lead technical discussions and provide guidance to junior engineers.
  • Advocate for best practices in observability, software development, and automation.
  • Contribute to technical documentation and internal knowledge-sharing initiatives.

Required Skills and Experience
  • Experience with Python (4+ years) and JavaScript with React (3+ years)
  • Experience with Kubernetes, Docker, and container orchestration (3+ years)
  • Expertise in REST API development (3+ years)
  • Experience working with Linux environments and shell scripting (Bash, Python, or similar)
  • Strong understanding of CI/CD pipelines and GitOps workflows
  • Familiarity with Helm, Terraform, Ansible, and Packer
  • Experience troubleshooting large-scale distributed systems
  • Excellent communication, collaboration, and problem-solving skills

Preferred Qualifications
  • Experience with Golang
  • Knowledge of Microsoft Graph API
  • Background in enterprise-scale observability systems
  • Experience with multi-cloud observability architectures (AWS, Azure, GCP)