AI Research Engineer, Inference

Hudson River Trading•New York, NY

$250,000 - $300,000

About The Position

Hudson River Trading (HRT) is seeking an AI Research Engineer (Inference) to join the HAIL team. HAIL (HRT AI Labs) is the team at HRT responsible for developing and maintaining our most powerful models, which our trading teams use to drive a significant fraction of our trading. We are building and deploying "foundation models for markets" that ingest and train on vast amounts of market and "alternative" data (such as language) to make predictions about future market state.

As an inference research engineer on HAIL, you will have a general mandate to improve all aspects of large-scale model inference, including but not limited to GPU kernel development, novel inference devices like ASICs and FPGAs (both off-the-shelf and in-house), and data streaming. You will work closely with our researchers to co-design and improve our models, and to shape the research agenda.

We have a complex inference solution involving a variety of devices deployed around the world to meet a variety of trading needs. We are simultaneously pursuing multiple strategies and developing many model types with different purposes, and we are strongly incentivized to squeeze as much as we can out of our systems. Your work will have a direct, clear, and large impact on the business, and it will be challenging: this is a field with no easy or obvious solutions.

Requirements

Strong engineering skills, especially any of: CUDA/Triton/Pallas/CuTe DSL kernel development, lower-level PyTorch/JAX/XLA development, CUDA Graphs, FPGA/ASIC experience
Two or more years of work experience building deep learning systems for any domain (robotics, biology, chemistry, physics, audio, video, recommendations, etc.)
Experience translating methods between areas of application is highly valued

Nice To Haves

LLM experience is valuable, but not necessary
Finance experience is not required

Responsibilities

Improve all aspects of large-scale model inference, including GPU kernel development, novel inference devices like ASICs and FPGAs, and data streaming.
Work closely with researchers to co-design and improve models.
Shape the research agenda.