Senior Data Scientist

Evolv Technologies Holdings•Waltham, MA

$129,000 - $209,000•Onsite

About The Position

Join the Evolv Machine Learning & Sensors team as a Data Scientist focused on driving a deep understanding of the sensor data, feature spaces, and data quality that power our AI/ML systems. This hands-on role emphasizes representation analysis, exploratory data insights, and data-centric improvements that directly enhance model accuracy, robustness, and generalization. You will work across classical ML and deep learning pipelines to identify blind spots, diagnose data issues, and guide data curation and collection strategies.

Success in the Role: What performance outcomes will you work toward in the first 6–12 months?

In the first 30 days:
Develop a strong understanding of Evolv’s sensor ecosystem, datasets, and ML pipelines.
Review dataset structure, labeling processes, and existing exploratory data analyses.
Run initial UMAP/PCA/t-SNE analyses to map data distributions and identify anomalies.
Identify opportunities to improve data quality, labeling consistency, and dataset coverage.

Within the first three months:
Perform deep representation analysis across sensor, time‑series, and feature data.
Evaluate classical ML and deep learning models by linking model errors to data issues.
Define data quality metrics and initial dataset acceptance criteria.
Collaborate with data collection teams to guide targeted data acquisition and relabeling.
Mine existing field data to understand patterns and extract useful information and insights.
Design methods to improve data quality, converting noisy or unverified data into clean, verified data.

By the end of the first year:
Own data‑centric insights that directly improve ML model performance.
Establish ongoing monitoring of data drift, blind spots, and label quality.
Provide strategic guidance for future data collection, annotation, and curation.
Develop automated tools and dashboards for data quality reporting and representation analysis.

The Work: What type of work will you be doing? What assignments, requirements, or skills will you be performing on a regular basis?

Data Understanding & Representation Analysis:
Analyze high‑dimensional sensor and feature data using UMAP, t‑SNE, PCA, and related techniques.
Identify clusters, outliers, distribution gaps, and blind spots across classes and environments.
Diagnose dataset shift, domain mismatch, sparsity, and representation collapse.

Model‑Aware Data Analysis:
Conduct data analysis aligned with both classical ML models (XGBoost, SVR, k‑NN, tree‑based models) and deep learning models (CNNs, Transformers).
Analyze embeddings, confusion matrices, and failure cases to map model issues back to data causes.

Data Quality & Curation:
Investigate imbalanced data, noisy sensor signals, and mislabeled or ambiguous samples.
Develop strategies for weakly labeled or unlabeled data using clustering or pseudo‑labeling.
Define data quality metrics, acceptance criteria, and labeling strategies.
Work with internal teams and external vendors to improve label consistency and coverage.

Insight‑Driven Improvements:
Translate exploratory insights into clear recommendations for data collection, relabeling, or filtering.
Drive data‑centric improvements instead of relying solely on algorithmic changes.
Track KPIs such as data quality, data quantity, collection rate, and utilization efficiency.

Collaboration & Communication:
Work closely with internal and external data collection teams to refine data pipelines.
Communicate findings through visualizations, reports, and technical deep‑dives.

Requirements

Master’s or PhD in Data Science, Computer Science, Applied Mathematics, Statistics, Physics, or related field.
2-3+ years of data science experience working with real‑world ML datasets (time‑series, images, video, sensors).
Proficiency in Python and data science libraries (NumPy, pandas, matplotlib, seaborn).
Hands‑on experience using UMAP, t‑SNE, PCA, or other representation analysis methods.
Experience analyzing data for both classical ML and deep learning models.
Strong understanding of ML fundamentals and model evaluation methodologies.

Nice To Haves

Experience with sensor or time‑series data (magnetic, radar, 3D, environmental, IoT).
Familiarity with scikit‑learn workflows and preprocessing techniques.
Experience addressing imbalanced datasets, label noise, and data drift.
Knowledge of embedding analysis, feature importance, and model interpretability.
Experience collaborating with annotation or data collection teams.
Familiarity with MLOps or data versioning tools (MLflow, W&B, DVC).

Responsibilities

Analyze high‑dimensional sensor and feature data using UMAP, t‑SNE, PCA, and related techniques.
Identify clusters, outliers, distribution gaps, and blind spots across classes and environments.
Diagnose dataset shift, domain mismatch, sparsity, and representation collapse.
Conduct data analysis aligned with both classical ML models (XGBoost, SVR, k‑NN, tree‑based models) and deep learning models (CNNs, Transformers).
Analyze embeddings, confusion matrices, and failure cases to map model issues back to data causes.
Investigate imbalanced data, noisy sensor signals, and mislabeled or ambiguous samples.
Develop strategies for weakly labeled or unlabeled data using clustering or pseudo‑labeling.
Define data quality metrics, acceptance criteria, and labeling strategies.
Work with internal teams and external vendors to improve label consistency and coverage.
Translate exploratory insights into clear recommendations for data collection, relabeling, or filtering.
Drive data‑centric improvements instead of relying solely on algorithmic changes.
Track KPIs such as data quality, data quantity, collection rate, and utilization efficiency.
Work closely with internal and external data collection teams to refine data pipelines.
Communicate findings through visualizations, reports, and technical deep‑dives.

Benefits

Equity as part of your total compensation package
Medical, dental, and vision insurance
Health Savings Account (HSA)
401(k) plan with a 2% company match
Flexible Paid Time Off (PTO): take the time you need to recharge, with manager approval and business needs in mind
Quarterly stipend for perks and benefits that matter most to you
Tuition reimbursement to support your ongoing learning and development
Subscription to Calm
