Senior Site Reliability Engineer

Fidelity Investments, Westlake, TX
Posted 2d ago
Hybrid

About The Position

Position Description: Provides day-to-day support and ensures stability for financial applications hosted on Amazon Web Services (AWS), including InvestOne, payment systems, and other price calculation applications. Builds and maintains Kubernetes (EKS) clusters for containerized workloads supporting batch processing and real-time financial calculations. Automates deployment and infrastructure provisioning using Terraform, Jenkins, and Continuous Integration/Continuous Delivery (CI/CD) pipelines. Supports applications performing exchange-traded fund (ETF), net asset value (NAV), and stock price calculations with high accuracy and reliability. Develops and maintains cloud-native solutions using AWS services, including Relational Database Service (RDS), EC2, Lambda, Step Functions, Fargate, Elastic Kubernetes Service (EKS), and S3. Ensures compliance with financial data regulations through Identity and Access Management (IAM) policies, encryption, and secure networking. Uses Python, Shell scripting, and HashiCorp Configuration Language (HCL) to build automation tools and enhance operational efficiency. Uses Ansible to automate infrastructure-related processes. Monitors application health and performance using Datadog, CloudWatch, and custom metrics dashboards.

Requirements

  • Bachelor’s degree in Computer Science, Information Science, Engineering, Information Technology, Information Systems, Cybersecurity, or a closely related field (or foreign education equivalent) and three (3) years of experience as a Senior Site Reliability Engineer (or closely related occupation) supporting Web-based applications in AWS and on-premises environments.
  • Or, alternatively, Master’s degree in Computer Science, Information Science, Engineering, Information Technology, Information Systems, Cybersecurity, or a closely related field (or foreign education equivalent) and one (1) year of experience as a Senior Site Reliability Engineer (or closely related occupation) supporting Web-based applications in AWS and on-premises environments.
  • Demonstrated Expertise (“DE”) supporting mutual fund accounting applications and infrastructure in on-premises and Cloud technology platforms, using Linux, Windows, Oracle, and REST APIs; performing fund accounting functions, managing Net Asset Value (NAV), Mill-Rate (MILS), yields, trade processing, data securities, recordkeeping, corporate action, money management, pricing, and regulatory reporting, using FIS InvestOne.
  • DE performing systems support by analyzing observability, resiliency, availability, and performance of applications, using Datadog, Splunk, and Telemetry; supporting applications by reprocessing failed transactions, correcting data, and ensuring that end-of-day processes are completed, using Control-M, Axway, and PDS; and troubleshooting critical infrastructure issues in AWS Cloud and on-premises environments, using AWS Command Line Interface (AWS CLI) and Kubernetes.
  • DE designing and deploying infrastructure across global data centers, including BareMetal, Citrix SDX, NetScaler VPX, and XenServers; and configuring Red Hat Satellites and Capsules, working on Citrix NetScaler systems for load balancing, and deploying Kubernetes infrastructure to support service team applications.
  • DE automating infrastructure and operational processes using Terraform, Ansible, and Chef to configure Linux systems, patch security updates, and provision AWS Cloud resources; and implementing Continuous Integration/Continuous Deployment workflows for automated deployments, system updates, and infrastructure maintenance, using Jenkins, GitHub, and Bitbucket.

Responsibilities

  • Implements disaster recovery procedures and contingency plans for critical financial systems.
  • Collaborates with product teams in Agile environments to deliver scalable and secure solutions.
  • Identifies and resolves production and non-production issues during critical outages impacting financial operations.
  • Creates automated solutions to reduce manual intervention and human errors in batch processing.
  • Performs root cause analysis for high-impact incidents.
  • Analyzes the observability, resiliency, availability, and performance of applications.
  • Triages, performs deep dives, and executes root cause analysis for issues in financial systems.
  • Provides resolution of business and system issues by initiating new solutions.
  • Supports and optimizes batch jobs for end-of-day pricing, reconciliation, and reporting workflows.
  • Expands and modifies systems to improve workflow and meet evolving business needs.
  • Consults with stakeholders to align system design with business principles and operational goals.
  • Participates in the development of tools to streamline operational procedures and data analysis.
  • Evaluates system designs for reliability, performance, and cost-effectiveness.
  • Delivers technical solutions to support service requests and enhance financial application support.