Machine Learning Engineer

Virtu Financial | New York, NY
Posted 17h ago | $200,000 - $300,000

About The Position

Virtu’s Research Technology team is looking for an experienced Machine Learning Engineer to join a small group of technologists whose primary function is building the infrastructure that powers our quantitative researchers. This is a unique opportunity to work at the intersection of machine learning and systematic trading, building tools that directly determine how fast our researchers can move and how effectively our GPU cluster translates into research output.

In this role, you will be responsible for the development of our ML research platform: the systems that manage data and compute, track experiments, and enable researchers to go from idea to result as efficiently as possible. You will work closely with quants and engineers alike and will play a central role in shaping how ML is done at the firm as we scale our capabilities. We mostly use Python, C++, and Java, along with a variety of open-source tools and proprietary solutions.

Requirements

  • 5+ years of experience in ML engineering, research infrastructure, or HPC environments
  • Strong Python engineering skills — you write clean, maintainable, well-tested code that other engineers want to build on. Exposure to C++ in a performance-sensitive context is a plus
  • Experience building or operating distributed training infrastructure, with working knowledge of how collective communication libraries (NCCL, Horovod, or similar) behave at scale
  • Practical experience with experiment tracking systems and strong opinions about what good research infrastructure looks like
  • Comfort working across the Linux systems stack — storage, networking, job scheduling — enough to follow a problem wherever it leads
  • Excellent communication skills and the ability to work closely with researchers and engineers across disciplines
  • Intellectually curious and self-driven — you proactively identify problems worth solving, not just problems you've been asked to solve

Nice To Haves

  • Experience with on-prem compute environments and job orchestration tools such as Slurm
  • Familiarity with GPU profiling tools (Nsight Systems, PyTorch Profiler) and hands-on experience optimizing GPU memory or compute utilization
  • Experience with columnar data formats and high-performance data processing tools such as Parquet, Arrow, and Polars
  • Familiarity with workflow orchestration tools (Prefect, Dagster, or similar)
  • Prior experience in environments with high-stakes time-series data at scale, such as quantitative finance or algorithmic trading; comparable experience from other fields is also welcome
  • Experience contributing to or extending open-source ML frameworks or infrastructure tooling

Responsibilities

  • Design and build experiment tracking, job orchestration, and reproducibility infrastructure so researchers can iterate quickly, compare runs reliably, and recover from failures without losing work
  • Create tools for all stages of the simulation lifecycle, including historical back-tests and production monitoring, and add new features to our simulators
  • Own visibility into GPU cluster utilization — track allocation, surface bottlenecks, and ensure our compute investment is being used effectively
  • Diagnose and resolve performance issues across training pipelines: data loading throughput, storage I/O, GPU utilization, and inter-node communication in distributed training runs
  • Build and maintain data pipelines that move financial data from storage into training workflows efficiently, with strong guarantees on correctness and versioning
  • Develop feature storage and retrieval patterns that support fast, reproducible access to training data at scale
  • Work directly with researchers to understand friction in their workflows, and build solutions that reduce it — from tooling improvements to infrastructure changes
  • Collaborate with existing infrastructure engineers on capacity planning, cloud/on-prem tradeoffs, and tooling decisions — this is a collaborative environment, not a siloed one
  • Stay current with developments in ML infrastructure tooling and bring relevant ideas and tools into our stack where they create genuine value