Principal Resiliency Engineer

DTCC
Coppell, TX
Hybrid

About The Position

The Information Technology group delivers secure, reliable technology solutions that enable DTCC to be the trusted infrastructure of the global capital markets. The team delivers high-quality information through activities that include building essential infrastructure capabilities to meet client needs and implementing data standards and governance.

We are seeking a Principal Reliability Engineer to join our Reliability Architecture team and play a pivotal role in shaping the future of enterprise observability. This hands-on engineering role is designed for a candidate who thrives in fast-paced, collaborative environments and is eager to influence architectural standards across the organization. You will engineer prototype workloads that simulate real-world business applications, demonstrating how proposed resiliency and observability strategies and standards can be embedded into modern, cloud-native environments. Your work will directly support enterprise-wide adoption of our observability mandate and contribute to platform modernization and resiliency goals.

Requirements

  • Minimum of 8 years in distributed application design and implementation
  • Bachelor’s degree in computer engineering or equivalent experience

Nice To Haves

  • 5+ years in enterprise Java technologies and open standards
  • 5+ years in infrastructure, networking, middleware, and database architecture
  • 5+ years in highly available architecture and disaster recovery
  • 3+ years in containers and cloud-based solution delivery
  • Strong troubleshooting and performance analysis skills
  • Java, Python, Bash, SQL
  • CI/CD pipelines and automation frameworks (e.g., Jenkins, Selenium)
  • OpenTelemetry, distributed tracing, metrics generation, logging
  • Chaos engineering tools (e.g., Gremlin, AWS FIS)
  • AWS and Azure cloud environments
  • Infrastructure as Code (IaC), container orchestration, hybrid deployments

Responsibilities

  • Build and deploy prototype applications that showcase observability capabilities across hybrid environments (on-premises, AWS, Azure, SaaS).
  • Conduct rigorous testing with simulated disruptions, environmental failures, and performance scenarios to validate proposed observability standards.
  • Collaborate with platform teams, application owners, and external vendors to integrate observability into real workloads.
  • Produce runbooks, configuration guides, architectural patterns, reusable dashboards, and findings to support enterprise enablement and scale.
  • Provide feedback and technical insight to IT Architecture leadership and teams to influence enterprise strategy.
  • Conduct technical evaluations of proposed standards for observability visualization, distributed tracing, enterprise logging, metrics operations, open standards adoption, event correlation, and alerting and notification.
  • Host sprint retrospectives and demos for engineering, business, and executive stakeholders to validate and promote adoption.

Benefits

  • Competitive compensation, including base pay and annual incentive
  • Comprehensive health and life insurance and well-being benefits, based on location
  • Pension / Retirement benefits
  • Paid Time Off, Personal/Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well-being