Data Scientist

SLB • Houston, TX

About The Position

This position involves building, training, and deploying large-scale, self-supervised "foundation" models. These models are designed to learn rich representations from time series and sequential sensor data, as well as from textual and vision data. The goal is to fine-tune these models for industrial and scientific applications, including anomaly/event detection, predictive maintenance, forecasting, classification, and multi-modal sensor fusion.

Requirements

MS or Ph.D. in computer science, data science, AI, or a related field
3+ years of relevant experience in data science, AI, or a related field
Expertise in Time Series & Sequential Data: processing, augmentation, feature engineering for financial, industrial, IoT, medical, or other sensor streams (univariate/multivariate time series)
Expertise in Sensor Data Analysis: diverse sensor modalities (e.g., accelerometers, temperature, vibration, audio, images), sampling rates, synchronization, and real-world noise/artifact handling
Expertise in Multi-Modality Learning: integrating heterogeneous data types (time series, images, text, audio, structured) into robust deep learning architectures; cross-modal representation learning
Expertise in Self-supervised and Semi-supervised Learning: time series foundation models, masked modeling, contrastive methods, temporal predictive coding, multimodal alignment and fusion
Proficiency with Model Architectures: sequence models (RNNs, GRU/LSTM, TCN), 1D/2D/3D CNNs, Transformers (BERT, ViT, TimeSformer), graph neural networks, diffusion/generative models, multi-modal/fusion encoders
Experience with Transfer Learning & Fine-Tuning at Scale: prompt/adapter-based strategies, temporal domain adaptation, few-shot learning for specialized tasks
Knowledge of Evaluation Metrics: regression/classification (MSE, F1, AUC), time series similarity (DTW, correlation), event detection/segmentation (IoU, accuracy), business/end-user KPIs
Expert Python programming (NumPy, SciPy, Pandas)
Proficiency in C++/CUDA for custom kernels and high-performance preprocessing
Experience with Deep Learning Frameworks: PyTorch (Lightning, Distributed), TensorFlow/Keras, JAX/Flax
Experience with Large-scale Training: multi-GPU, multi-node clusters, mixed-precision, ZeRO optimization, scalable data loaders for long sequences
Skills in Data Engineering: robust pipelines for ingesting, cleaning, segmenting, and aligning large-scale, time-synchronized multi-sensor datasets
Strong foundation in Linear Algebra, Probability & Statistics, Optimization (stochastic, convex/non-convex, Bayesian)
Strong foundation in Signal Processing: Fourier/wavelet analysis, filters (Kalman, Savitzky–Golay), resampling, noise modeling
Strong foundation in Numerical Methods: ODE/PDE solvers, inverse problems, regularization, time-frequency methods for complex systems
Ability to work in cross-disciplinary teams with domain experts, engineers, product owners, and end-users from industrial, scientific, or medical backgrounds
Ability to clearly present complex model behaviors (interpretability, attention analysis), uncertainty quantification, and value impact

Responsibilities

Build, train, and deploy large-scale, self-supervised "foundation" models that learn rich representations of time series and sequential sensor data, as well as textual and vision data. Fine-tune these models for tasks such as anomaly/event detection, predictive maintenance, forecasting, classification, and multi-modal sensor fusion in industrial and scientific applications.
