Director, Software Product Management & RE

Morgan Stanley • Boston, MA

About The Position

Morgan Stanley Services Group Inc. is seeking a Director, Software Product Management & RE in Boston, MA to deploy and maintain comprehensive monitoring, logging, and alerting. Handle incident response, root cause analysis, and post-mortem reviews for timely resolution of production outages, and participate in a 24/7 on-call rotation. Handle AI-enhanced knowledge management efforts of the Production Support team. Embed content and documentation from multiple sources, and maintain a centralized knowledge repository with regular validation. Build and implement self-healing capabilities for detecting and remediating production issues. Plan, execute, and automate high-integrity data migrations between on-prem databases, and replicate data to Snowflake on Azure Cloud. Handle cross-vendor incident response during outages and maintain runbooks for recovery scenarios. Complete user change requests and enhancements in production environments, and ensure changes go through performance, load, and compliance testing before deployment. Act as the primary operational point of contact for business teams and handle business flows such as RFBs and end-of-day processing. Identify and resolve operational bottlenecks in Aladdin workflows, data feeds, and batch processing.

Requirements

  • Requires a Master's degree in Engineering (any), Computer Science, or a related field of study.
  • Requires three (3) years of experience in the position offered or three (3) years as an Associate, Business Data Analyst, Software Developer, or a closely related occupation.
  • Requires three (3) years of experience with the following skills: Generative AI; scripting and automation using Linux and Bash; Snowflake Data Modelling; SQL; data replication from on-premise legacy databases including Oracle, SQL Server, DB2, and Sybase to cloud databases including Snowflake using the High Volume Replication (HVR) tool; data analysis and data pipeline development using Python; Data Pipelines Code; Dataiku and Airflow; scheduling, automating, and optimizing batch jobs using Autosys; User Acceptance Testing and Software Validation; data management and data integrity checks; working on user requests for software and process enhancements; data visualization tools including Tableau and Power BI; CI/CD process; monitoring tools including Datadog, Splunk, Prometheus, and Grafana; automating incident response and recovery workflows; creating tickets and reports using ServiceNow; incident management using PagerDuty; preparing post-mortem decks for Root Cause Analysis and Impact Mitigation; Agile methodologies; Kanban; Jira boards; version control including Git or Bitbucket; cloud-based data warehousing including Snowflake; public cloud platforms including Azure or AWS; cloud infrastructure monitoring using Terraform; business impact analysis using BigPanda; synthetic monitoring and Application Programming Interface (API) performance testing using APICA.

Responsibilities

  • Deploy and maintain comprehensive monitoring, logging, and alerting.
  • Handle incident response, root cause analysis, and post-mortem reviews for timely resolution of production outages, and participate in a 24/7 on-call rotation.
  • Handle AI-enhanced knowledge management efforts of the Production Support team.
  • Embed content and documentation from multiple sources, and maintain a centralized knowledge repository with regular validation.
  • Build and implement self-healing capabilities for detecting and remediating Production issues.
  • Plan, execute, and automate high-integrity data migrations between on-prem databases, and replicate data to Snowflake on Azure Cloud.
  • Handle cross-vendor incident response during Outages and maintain runbooks for recovery scenarios.
  • Complete user change requests and enhancements in Production environments, and ensure changes go through Performance, Load, and Compliance testing before deployment.
  • Act as the primary operational point of contact for business teams and handle business flows such as RFBs and end-of-day processing.
  • Identify and resolve operational bottlenecks in Aladdin workflows, data feeds, and batch processing.