About The Position

Are you ready to make an impact at DTCC? Do you want to work on innovative projects, collaborate with a dynamic and supportive team, and receive investment in your professional development? At DTCC, we are at the forefront of innovation in the financial markets. We are committed to helping our employees grow and succeed. We believe that you have the skills and drive to make a real impact. We foster a thriving internal community and are committed to creating a workplace that looks like the world that we serve.

The Information Technology group delivers secure, reliable technology solutions that enable DTCC to be the trusted infrastructure of the global capital markets. The team delivers high-quality information through activities that include development of essential, building infrastructure capabilities to meet client needs and implementing data standards and governance.

As a Senior Application Support Engineer (SRE), you will play a critical role in improving the reliability, scalability, and performance of DTCC’s mission-critical applications. You will go beyond traditional application support by applying Site Reliability Engineering (SRE) principles to drive system stability, reduce operational risk, and enhance overall resilience. This role sits at the intersection of engineering, infrastructure, and operations, where you will influence application design, strengthen observability, and prevent incidents before they occur. You will partner closely with application development, infrastructure, network, and security teams to improve operational readiness, monitoring, and system reliability, while helping promote a strong SRE culture across the organization.

Requirements

  • Bachelor’s degree preferred or equivalent practical experience
  • 6-8 years of experience in application support, SRE, or a similar role
  • Strong understanding of SRE principles, reliability engineering, and production support best practices
  • Working knowledge of Unix/Linux, Windows, Mainframe, SQL, and PL/SQL
  • Experience working in application or production support environments with complex systems
  • Proven ability in root cause analysis, incident management, and problem resolution
  • Hands-on experience with monitoring and observability tools (e.g., Splunk, Dynatrace)
  • Familiarity with cloud platforms (AWS preferred) and distributed systems
  • Experience with DevOps tools and automation practices
  • Strong problem-solving mindset and passion for improving system reliability
  • Exposure to scripting languages such as Python, Shell, or similar
  • Familiarity with tools such as AutoSys, ServiceNow, or JIRA
  • Strong communication and collaboration skills across cross-functional teams

Responsibilities

  • Apply SRE principles and practices to improve system reliability, scalability, and performance
  • Evaluate system behavior under failure scenarios and contribute to failure mode analysis and resilience design
  • Define and implement strategies for fault tolerance, recovery, and disaster readiness
  • Partner with development teams to implement monitoring, alerting, and observability solutions
  • Define actionable alerts and establish SLIs / SLOs to measure system health
  • Drive automation of operational processes to reduce manual effort and improve recovery times
  • Participate in major incident resolution, validating diagnosis and driving root cause analysis (RCA)
  • Lead or contribute to post-incident reviews, identifying long-term fixes to prevent recurrence
  • Improve overall system stability by addressing recurring issues and operational gaps
  • Work closely with development teams to embed SRE practices into the software development lifecycle
  • Participate in design reviews, sprint planning, and standups to advocate for reliability, scalability, and observability
  • Ensure non-functional requirements (NFRs) such as availability and performance are considered early

Benefits

  • Competitive compensation, including base pay and annual incentive
  • Comprehensive health and life insurance and well-being benefits, based on location
  • Pension / Retirement benefits
  • Paid Time Off and Personal/Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well-being