ML Research Scientist - Deep Learning & Transformer Architectures

Millennium - New York, NY
$150,000 - $200,000

About The Position

As part of a long-term research agenda within a newly formed systematic equities pod, we are building a proprietary Transformer-based model trained on tokenized intraday market data for next-token prediction of price movements. We are seeking an exceptional ML research scientist with deep expertise in Transformer architectures and large-scale model training. You will design, implement, and train a custom decoder-only Transformer from scratch: the goal is not to fine-tune an existing LLM but to build a purpose-built architecture for financial time series. This is a long-term research project backed by significant computational resources. The successful candidate will have a PhD in machine learning or a related field and a demonstrated ability to implement Transformer architectures from first principles.

Requirements

  • PhD in Machine Learning, Computer Science, Statistics, Applied Mathematics, or a related field with a focus on deep learning
  • Demonstrated ability to implement Transformer architectures from scratch (not just fine-tuning pre-trained models)
  • Deep understanding of attention mechanisms, positional encodings, tokenization strategies, and training dynamics
  • Expert-level PyTorch skills including custom modules, training loops, mixed-precision, and multi-GPU training
  • Strong mathematical foundations: linear algebra, probability theory, optimization, information theory
  • Experience training models at scale (100M+ parameters)
  • Strong programming skills in Python, plus C++ for performance-critical components
  • Self-directed researcher capable of defining and executing a multi-month research agenda
  • Familiarity with AI-assisted development tools (e.g., Cursor, Claude Code)

Nice To Haves

  • Experience applying deep learning to financial data or time-series forecasting
  • Familiarity with tokenization approaches for continuous or non-text data
  • Published research in top ML venues (NeurIPS, ICML, ICLR) or equivalent industry experience
  • Knowledge of market microstructure and intraday trading dynamics
  • Experience with model compression, quantization, and inference optimization

Responsibilities

  • Design and implement a custom decoder-only Transformer architecture optimized for tokenized financial time-series data
  • Develop a novel tokenization scheme for intraday market data: price movements, volume, order flow, and cross-sectional features
  • Implement efficient training pipelines using PyTorch with mixed-precision training, gradient checkpointing, and multi-GPU parallelism
  • Design attention mechanisms adapted to financial data: temporal attention patterns, cross-asset attention, and multi-scale representations
  • Build evaluation frameworks for next-token prediction accuracy, signal quality, and trading performance
  • Implement inference optimization for low-latency production deployment: model quantization, KV-cache, speculative decoding
  • Conduct rigorous ablation studies to validate architecture choices and training methodology
  • Collaborate with the team to integrate model predictions into the live trading pipeline
  • Document research methodology, experimental results, and architectural decisions

Benefits

  • Base salary
  • Discretionary performance bonus
  • Comprehensive benefits