About The Position

We're looking for a Senior AI Engineer to own our AI program end-to-end. Not a prompt engineer. Not a data engineer. The person who owns how our models get selected, trained, tuned, routed, and evaluated — and who walks in with the confidence to define the architecture from the hardware up. You'll work directly with the CTO, lead our fine-tuning strategy (LoRA is going to be core for us), and decide how we get the most out of our GPU spend. This is a senior IC role in a flat org — no management required, but you'll be the technical anchor other engineers learn from.

Requirements

  • 8+ years building production-grade ML, data, or AI systems
  • Hands-on experience training and fine-tuning models — LoRA, QLoRA, adapter methods, or full fine-tunes. Actual model work, not just prompt iteration
  • Confidence to define GPU architecture given a goal and a budget — hardware choices, training strategy, cost/performance tradeoffs
  • Strong grasp of prompt engineering, context construction, and retrieval design
  • Comfortable working in LangChain and building agents, not just chains
  • Strong Python: testable, maintainable, clearly structured
  • Understanding of model evaluation, observability, and feedback loops
  • Excited to push from prototype → production → iteration
  • Senior IC judgment: you scope your own work, push back when it's right, and make calls others can build on
  • Strong English communication skills to collaborate clearly and effectively with teammates

Nice To Haves

  • Have shipped a fine-tuned model into production and can walk us through the tradeoffs you made
  • Have built agent-like workflows with LangGraph or similar
  • Have worked on semantic chunking, vector search, or hybrid retrieval strategies
  • Can walk us through a real-world model or prompt failure — and how you fixed it
  • Have experience with PySpark, Databricks, or lakehouse architecture
  • Have contributed to OSS tools or internal AI platforms
  • Think of yourself as both an engineer and a systems designer
  • Have mentored other senior engineers and enjoyed it

Responsibilities

  • Own our fine-tuning strategy end-to-end — LoRA first, full fine-tunes where they earn it. What we tune, on what hardware, against what evals
  • Define the GPU architecture. We have working infrastructure (H100 for production, A100 for training); your call to confirm, reshape, or rebuild
  • Drive model selection and routing across Gemini, Anthropic, and OpenAI — the right model for the right job, with cost and latency in the equation
  • Build agentic LLM pipelines using LangChain, LangGraph, and LangSmith
  • Design and iterate on prompt strategies, with a focus on consistency and context
  • Construct retrieval-augmented generation (RAG) systems from scratch
  • Instrument evaluation metrics, telemetry, and feedback loops to guide model and prompt evolution
  • Work alongside product, frontend, and backend engineers to tightly integrate AI into user-facing flows

Benefits

  • Full-time role with competitive comp
  • Flexible hours, async-friendly culture, engineering-led environment
© 2024 Teal Labs, Inc.