OSTTRA-posted 2 days ago
$90,000 - $122,000/Yr
Full-time • Mid Level
Princeton, NY
1,001-5,000 employees

Site Reliability Engineer – Datadog Specialist

The Team: The IT Operations team at S&P Dow Jones Indices (S&P DJI) is tasked with owning and maintaining the Production IT systems that underpin S&P DJI's index platforms and applications, ensuring their high availability. The team prioritizes service availability, service request management, and continuous improvement of support processes through collaborative engagement with business stakeholders, operations, infrastructure, and development teams. Additionally, the team is involved in critical activities such as incident management, emergency response, change management, problem management, and capacity planning to support the robustness of S&P DJI's index platforms.

  • Design, implement, and manage end-to-end observability using Datadog APM, DBM, log pipelines, synthetic monitoring, and AI-driven alerting.
  • Maintain production monitoring, respond to incidents, and lead root cause analysis using Datadog, Splunk, and ELK.
  • Enhance automation and testing frameworks using Java, Spring Boot, Selenium, Cucumber, Playwright, and Jenkins.
  • Operate AWS services including EC2, ECS, RDS, S3, DynamoDB, and Secrets Manager.
  • Contribute to CI/CD practices and containerization technologies.
  • Integrate monitoring with PagerDuty and ServiceNow for incident workflows.
  • Participate in post-incident reviews, disaster recovery testing, and SRE process improvements.
  • 4 years of experience in SRE, DevOps, or platform engineering roles.
  • Bachelor's degree in Computer Science or a similar field of study.
  • Proven expertise in Datadog APM, DBM, logging, and infrastructure monitoring.
  • Strong programming skills in Java and Python.
  • Hands-on experience with AWS, including operational management of core services.
  • Experience with CI/CD pipelines and container orchestration technologies.
  • Familiarity with ITSM tools (ServiceNow, PagerDuty).
  • Understanding of observability best practices, log correlation, and distributed tracing.
  • Excellent troubleshooting, documentation, and communication skills.
  • Datadog certifications (APM, Logs, Fundamentals).
  • Exposure to other monitoring tools like Splunk, Dynatrace, or ELK.
  • Knowledge of Agile/Scrum and globally distributed team collaboration.
  • Experience with Index/Benchmarks, Asset Management, or Portfolio Investment.
  • Health & Wellness: Health care coverage designed for the mind and body.
  • Flexible Downtime: Generous time off helps keep you energized for your time on.
  • Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
  • Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
  • Family Friendly Perks: It’s not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
  • Beyond the Basics: From retail discounts to referral incentive awards, small perks can make a big difference.