The Bank of New York Mellon • Posted 3 months ago
$83,000 - $155,000/Yr
Full-time • Mid Level
Jersey City, NJ
5,001-10,000 employees
Securities, Commodity Contracts, and Other Financial Investments and Related Activities

At BNY, our culture allows us to run our company better and enables employees' growth and success. As a leading global financial services company at the heart of the global financial system, we influence nearly 20% of the world's investible assets. Every day, our teams harness cutting-edge AI and breakthrough technologies to collaborate with clients, driving transformative solutions that redefine industries and uplift communities worldwide. Recognized as a top destination for innovators and champions of inclusion, BNY is where bold ideas meet advanced technology and exceptional talent. Together, we power the future of finance - and this is what #LifeAtBNY is all about. Join us and be part of something extraordinary.

We're seeking a future team member for the role of SRE / Site Reliability Engineer to join our Technology team. This role is located in Jersey City, NJ.

  • Drive reliability and performance by defining SLOs/SLIs, improving observability, and proactively identifying and addressing system bottlenecks across cloud environments.
  • Automate infrastructure and operations using Terraform, Kubernetes, and CI/CD tools to eliminate toil and enable scalable, fault-tolerant deployments.
  • Collaborate cross-functionally with product, infrastructure, and DevOps teams to reduce incidents, build resilient services, and ensure architectural clarity.
  • Lead incident management by participating in on-call rotations, conducting postmortems, and implementing automated recovery to minimize downtime.
  • Build and maintain monitoring systems with tools like Prometheus, Grafana, AppDynamics, and Splunk to support real-time alerting and root cause analysis.
  • Develop platform tooling and pipelines for container orchestration, third-party integrations, and cloud-native operations to improve system efficiency and reliability.
  • Maintain and improve live services by measuring and monitoring latency and overall system health, working closely with tech support and operations teams.
  • Leverage and define KPIs to understand service performance and identify corrective actions.
  • Create, manage, and use dashboards for continuous monitoring and health checks of applications and underlying infrastructure.
  • Design and implement solutions to customer friction points and improve the entire lifecycle of services from inception through sustainment.
  • Assist in creating and maintaining automation to improve reliability and velocity in addressing issues during regular maintenance tasks.
  • Mentor engineers and champion SRE best practices, embedding a reliability-first culture and ensuring technical excellence across engineering teams.
  • Bachelor's degree in computer science or a related discipline, or equivalent work experience required; advanced degree preferred.
  • 5-8 years of related experience; experience in the securities or financial services industry is a plus.
  • Strong expertise in cloud infrastructure (Azure, AWS, or GCP), containerization (Docker, Kubernetes), and Infrastructure as Code (Terraform, Helm).
  • Proficiency in observability and monitoring tools such as Prometheus, Grafana, AppDynamics, Datadog, Splunk, and experience with incident response and on-call support.
  • Solid programming and scripting skills in languages like Python, Go, or Java, with a focus on automation, tooling, and system integration.
  • Deep understanding of SRE principles, including SLAs, SLOs, error budgets, postmortems, and reliability-focused system design.
  • Familiarity with automated testing, DevSecOps practices, CI/CD methods, performance engineering, and security controls.
  • Strong collaboration and communication skills, with experience working in Agile environments and partnering with cross-functional engineering, product, and operations teams.
  • Proven track record in technical engineering, with coding experience beyond simple scripts.
  • Highly competitive compensation, benefits, and wellbeing programs rooted in a strong culture of excellence and our pay-for-performance philosophy.
  • Access to flexible global resources and tools for your life's journey.
  • Focus on your health, foster your personal resilience, and reach your financial goals as a valued member of our team.
  • Generous paid leaves, including paid volunteer time, that can support you and your family through moments that matter.