Applied AI/ML [Multiple Positions Available]

JPMorgan Chase & Co.•New York, NY

3h•$178,000 - $260,000•Onsite

About The Position

Duties:
Identify strategic and operational challenges that can be solved through data.
Apply machine learning and deep learning techniques to solve financial problems around payment optimization and recommendation.
Pose identified problems in formats conducive to quantitative modeling.
Apply descriptive and predictive models to address quantified business problems, including linear and non-linear time series techniques for time series prediction.
Access and query various databases and data sources to create data sets required for predictive and descriptive analytics.
Transform unstructured bank data into high-quality data assets to enable tactical and strategic product solutions.
Work on highly confidential data assets and perform exploratory analysis to check for missing elements and outliers to establish data integrity.
Combine business understanding with theoretical knowledge to augment available data by performing feature engineering.
Identify evaluation metrics to measure performance of models.
Build both batch and real-time model prediction pipelines and work with data engineers to address scalability issues in testing and production environments.
Collaborate with multiple partner teams such as Business Management, Technology, Product Management, and Compliance to deploy solutions into production.
Translate results into measures of business impact that enable accurate assessment of the risks involved, and explain complex concepts to senior management and stakeholders.

Requirements

Master's degree in Data Science, Operations Research, Computer Science, Mathematics, Electrical Engineering, Financial Engineering, Quantitative Finance, Computational Finance, or related field of study plus 2 years of experience in the job offered or as Applied AI/ML Associate, Data Scientist, or related occupation.
Designing and implementing robust data analysis solutions using Python, R, and SQL to extract actionable insights from complex datasets
Reading, analyzing, and executing algorithms on datasets using Hadoop, PySpark, Apache Spark, and Hive to ensure scalability and efficiency in processing structured and unstructured data for advanced analytics and predictive insights
Performing interactive Python coding and experimentation using JupyterLab
Visualizing data with Tableau
Creating statistical plots with Matplotlib and Seaborn
Implementing version control using Git, GitHub, and Bitbucket to manage code repositories, track changes, and ensure integrity and consistency of project development
Forecasting trends, identifying seasonal patterns, and making data-driven predictions using ARIMA and SARIMA models for time series analysis
Quantifying relationships between variables and predicting continuous or categorical outcomes using regression analysis techniques including Linear, Logistic, GLM, Ridge, and Lasso
Constructing models that categorize data based on input features, optimizing predictive accuracy using classification algorithms including Random Forest, SVM, and Decision Trees
Implementing ensemble methods including XGBoost, Gradient Boosting Machine, Bagging, and Boosting to aggregate models for improved performance and reduced variance
Partitioning datasets into groups based on similarity metrics and creating insights using clustering techniques including K-means, Hierarchical, DBSCAN, and Graph ML
Reducing feature space dimensionality and enhancing computational efficiency and model interpretability using dimensionality reduction methods including Principal Component Analysis, Singular Value Decomposition, and feature engineering
Extracting features from text data using NLP techniques including TF-IDF for text vectorization and POS tagging for syntactic analysis
Comparing strings approximately using fuzzy matching and capturing semantic relationships between words using embeddings with FastText, Word2Vec, or GloVe
Performing language processing using Python libraries including NLTK and spaCy
Optimizing model performance using model evaluation techniques including cross-validation and hyperparameter tuning
Assessing predictive accuracy and reliability using metrics including ROC-AUC, Gini Score, and Precision-Recall
Designing and implementing deep learning architectures including CNNs for image processing, RNNs and LSTMs for sequential data analysis, and Transformers for advanced language modeling and attention mechanisms
Employing optimization techniques including hyperparameter tuning for model refinement, model pruning for efficiency, and gradient descent for minimizing loss functions and enhancing model performance
Building predictive models using mathematical concepts including linear algebra and multivariate calculus for model formulation, simulation for scenario analysis, hypothesis testing and inference for statistical validation, and probability theory and distributions for modeling data variability
