Data Scientist

SLB, Houston, TX

About The Position

This position involves building, training, and deploying large-scale, self-supervised "foundation" models that learn rich representations from time series and sequential sensor data, as well as text and vision data. The models are then fine-tuned for industrial and scientific applications, including anomaly/event detection, predictive maintenance, forecasting, classification, and multi-modal sensor fusion.

Requirements

  • M.S. or Ph.D. in computer science, data science, AI, or a related field
  • 3+ years of relevant experience in data science, AI, or a related field
  • Expertise in Time Series & Sequential Data: processing, augmentation, feature engineering for financial, industrial, IoT, medical, or other sensor streams (univariate/multivariate time series)
  • Expertise in Sensor Data Analysis: diverse sensor modalities (e.g., accelerometers, temperature, vibration, audio, images), sampling rates, synchronization, and real-world noise/artifact handling
  • Expertise in Multi-Modality Learning: integrating heterogeneous data types (time series, images, text, audio, structured) into robust deep learning architectures; cross-modal representation learning
  • Expertise in Self-supervised and Semi-supervised Learning: time series foundation models, masked modeling, contrastive methods, temporal predictive coding, multimodal alignment and fusion
  • Proficiency with Model Architectures: sequence models (RNNs, GRU/LSTM, TCN), 1D/2D/3D CNNs, Transformers (BERT, ViT, TimeSformer), graph neural networks, diffusion/generative models, multi-modal/fusion encoders
  • Experience with Transfer Learning & Fine-Tuning at Scale: prompt/adapter-based strategies, temporal domain adaptation, few-shot learning for specialized tasks
  • Knowledge of Evaluation Metrics: regression/classification (MSE, F1, AUC), time series similarity (DTW, correlation), event detection/segmentation (IoU, accuracy), business/end-user KPIs
  • Expert Python programming (NumPy, SciPy, Pandas)
  • Proficiency in C++/CUDA for custom kernels and high-performance preprocessing
  • Experience with Deep Learning Frameworks: PyTorch (Lightning, Distributed), TensorFlow/Keras, JAX/Flax
  • Experience with Large-scale Training: multi-GPU, multi-node clusters, mixed-precision, ZeRO optimization, scalable data loaders for long sequences
  • Skills in Data Engineering: robust pipelines for ingesting, cleaning, segmenting, and aligning large-scale, time-synchronized multi-sensor datasets
  • Strong foundation in Linear Algebra, Probability & Statistics, Optimization (stochastic, convex/non-convex, Bayesian)
  • Strong foundation in Signal Processing: Fourier/wavelet analysis, filters (Kalman, Savitzky–Golay), resampling, noise modeling
  • Strong foundation in Numerical Methods: ODE/PDE solvers, inverse problems, regularization, time-frequency methods for complex systems
  • Ability to work in cross-disciplinary teams with domain experts, engineers, product owners, and end-users from industrial, scientific, or medical backgrounds
  • Ability to clearly present complex model behaviors (interpretability, attention analysis), uncertainty quantification, and value impact

Responsibilities

  • Build, train, and deploy large-scale, self-supervised "foundation" models that learn rich representations of time series and sequential sensor data, as well as text and vision data, and fine-tune them for industrial and scientific tasks such as anomaly/event detection, predictive maintenance, forecasting, classification, and multi-modal sensor fusion.