About The Position

This role designs, develops, and maintains Extract/Transform/Load (ETL) workflows and large-scale data pipelines using Python, SQL, Hive, PySpark, and BigQuery on Hadoop and Google Cloud (Cloud Composer, Airflow, Dataproc); builds and deploys machine learning models to optimize marketing, sales, and operational strategies; and delivers Large Language Model (LLM)-based solutions, including Retrieval-Augmented Generation (RAG) architectures and agentic AI workflows. Full duties are listed under Responsibilities below. Position may work at various and unanticipated worksites throughout the United States. Telecommuting permitted.

Requirements

  • Python
  • SQL (Structured Query Language)
  • Hive
  • PySpark
  • BigQuery
  • Hadoop
  • Cloud Composer
  • Airflow DAGs
  • Dataproc clusters
  • CI/CD pipelines
  • GitHub
  • Jenkins
  • Classification and regression models
  • Uplift modeling
  • Predictive analysis
  • Feature engineering
  • Model validation
  • Hyperparameter tuning techniques
  • A/B testing frameworks
  • Causal inference methods
  • Large Language Models (LLMs)
  • Retrieval-Augmented Generation (RAG) architectures
  • Vector databases
  • Embedding models
  • Prompt engineering strategies
  • Supervised fine-tuning methodologies
  • Agentic AI workflows
  • Shell scripting
  • Scheduling tools like Zeke and Airflow
  • Data dictionaries
  • Metadata repositories
  • Tableau
  • Power BI

Responsibilities

  • Design, develop, and maintain Extract/Transform/Load (ETL) workflows and large-scale data pipelines using Python, SQL (Structured Query Language), Hive, PySpark, and BigQuery (see the PySpark sketch after this list).
  • Orchestrate, optimize, and deploy end-to-end data pipelines for ingesting, processing, and transforming large volumes of structured and unstructured data using Hadoop, Cloud Composer, Airflow DAGs, and Dataproc clusters (see the Airflow sketch below).
  • Implement CI/CD pipelines using GitHub and Jenkins to automate deployments and ensure high availability of data pipelines.
  • Develop and deploy classification and regression models, uplift models, and predictive analyses to optimize marketing, sales, and operational strategies.
  • Apply feature engineering, model validation, and hyperparameter tuning techniques to enhance model accuracy and robustness.
  • Implement A/B testing frameworks and causal inference methods to assess and refine data-driven decisions (see the A/B test sketch below).
  • Design and implement Large Language Model (LLM)-based solutions for automated customer interactions, intelligent search, and content generation.
  • Build and optimize Retrieval-Augmented Generation (RAG) architectures leveraging vector databases and embedding models for domain-specific knowledge retrieval (see the retrieval sketch below).
  • Develop prompt engineering strategies, supervised fine-tuning methodologies, and Agentic AI workflows to create adaptive and autonomous AI-driven solutions.
  • Use shell scripting and scheduling tools such as Zeke and Airflow to manage and monitor job execution.
  • Develop data dictionaries and metadata repositories to document data lineage and improve accessibility.
  • Work closely with cross-functional teams, including business analysts, data scientists, and IT teams, to ensure seamless integration of data solutions.
  • Create and manage data models behind interactive Tableau and Power BI dashboards that provide key insights to business users and management.
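
For illustration, a minimal PySpark ETL sketch in the spirit of the first responsibility above. The table name, columns, and output path are hypothetical, not part of the posting:

```python
# Minimal PySpark ETL sketch; table name, columns, and path are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").enableHiveSupport().getOrCreate()

# Extract: read raw orders from a Hive table.
raw = spark.table("raw_db.orders")

# Transform: enforce types, drop bad rows, aggregate daily revenue.
daily = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .filter(F.col("amount") > 0)
       .groupBy(F.to_date("order_ts").alias("order_date"))
       .agg(F.sum("amount").alias("revenue"), F.count("*").alias("orders"))
)

# Load: write partitioned Parquet for downstream consumers.
daily.write.mode("overwrite").partitionBy("order_date").parquet("/warehouse/curated/daily_orders")
```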
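
Similarly, a minimal sketch of the orchestration side, assuming Airflow 2.4+ (for the `schedule` argument); the DAG id, schedule, and task callables are hypothetical:

```python
# Minimal Airflow DAG sketch (Airflow 2.4+); ids, schedule, and callables are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():
    ...  # pull raw files into the landing zone

def transform():
    ...  # e.g., submit the PySpark job sketched above

with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    ingest_task >> transform_task  # ingest must finish before transform starts
```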
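
For the A/B testing responsibility, a minimal sketch of a two-proportion z-test on conversion counts, using statsmodels; all numbers are hypothetical:

```python
# Minimal A/B test sketch: two-proportion z-test; all counts are hypothetical.
from statsmodels.stats.proportion import proportions_ztest

conversions = [420, 468]      # conversions in variant A and variant B
exposures = [10_000, 10_000]  # users exposed to each variant

stat, p_value = proportions_ztest(count=conversions, nobs=exposures)
print(f"z = {stat:.2f}, p = {p_value:.4f}")  # a small p-value suggests a real lift
```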
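
Finally, a minimal sketch of the retrieval step in a RAG architecture. The `embed()` function and in-memory index are hypothetical stand-ins for a real embedding model and vector database:

```python
# Minimal RAG retrieval sketch; embed() and the in-memory index are hypothetical
# stand-ins for a real embedding model and vector database.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical: a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

docs = ["Refund policy: ...", "Shipping times: ...", "Warranty terms: ..."]
index = np.stack([embed(d) for d in docs])  # toy "vector database"

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = index @ embed(query)  # cosine similarity, since vectors are unit-norm
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

# Retrieved passages ground the LLM prompt for domain-specific answers.
context = "\n".join(retrieve("How long does shipping take?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: How long does shipping take?"
```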

What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Education Level: None listed
  • Number of Employees: 5,001-10,000
