The Data Science Lead will serve as the strategic architect and research pioneer for the organization’s data ecosystem. This role is responsible for designing robust data architectures, leading research and development (R&D) for novel data sources, establishing rigorous analytical methodologies, and ensuring the seamless, scalable ingestion of high-quality data into downstream production solutions.

Core Pillars of Responsibility

1. Data Architecture & Scalable Engineering
Blueprint Design: Design and oversee the evolution of scalable data architectures that support advanced analytics, machine learning (ML) modeling, and real-time processing.

2. R&D & Novel Data Source Evaluation
Exploratory Research: Scout, evaluate, and pressure-test new internal, external, and alternative data sources (e.g., synthetic data, IoT streams, third-party APIs) for predictive power and commercial viability. Lead ideation and feature engineering for these data sources, and document how they align with current and future data architecture designs.
Proof of Concepts (PoCs): Lead rapid prototyping and PoCs to validate new technologies, algorithms, and data structures before scaling them to production.
Vendor & Partner Assessment: Technically vet data vendors and partners to ensure data quality, density, and seamless integration capabilities.

3. Methodology & Analytical Rigor
Framework Standardization: Define and document the organization's gold-standard methodologies for statistical analysis, experimental design (A/B testing), and ML modeling.
Evaluation Metrics: Establish rigorous validation protocols and evaluation metrics (e.g., precision/recall, drift detection, bias/fairness audits) to ensure model and data integrity.
Continuous Improvement: Keep the organization at the cutting edge of data science by translating academic research and emerging industry trends into practical business methodologies.

4. Ingestion & Solution Integration
Productionalization Bridge: Serve as the critical bridge between R&D and production, ensuring that complex analytical models and data sources are seamlessly ingested into core business products and solutions.
API & Interface Design: Oversee data delivery contracts between the data science ecosystem and downstream software applications to ensure clean, well-documented APIs.

Key Deliverables (First 12 Months)

Data Source Playbook: A formalized framework for scoring, vetting, and onboarding new data assets.
Methodology Registry: A centralized repository of approved statistical models, evaluation metrics, and ingestion protocols to ensure team-wide consistency.
Feature Importance Registry & Feature Engineering Roadmap: A centralized repository that connects each current data source to its product value, the impact of its removal, and possible substitutes, paired with a roadmap of how Prove can leverage these signals in new and differentiated ways.
Architectural Roadmap: A 12-month to 3-year vision aligning data science infrastructure with corporate scaling goals.
Job Type: Full-time
Career Level: Senior
Education Level: No Education Listed