AI & Data Engineer

Tern Travel

$145,000 - $175,000

About The Position

This role builds and owns the AI and data systems at the core of Tern's product. You'll set the standard on evals, pipeline reliability, and advisor reporting on a small team where the scope is real and the ownership is yours. If you've been waiting for AI problems worth owning, this is that job.

AI is becoming central to Tern's product, and the quality of those AI features lives or dies on the data behind them. This role sits right at that intersection. You'll be a senior builder on our data team, working closely with our Data / AI Lead, and your north star is making Tern's AI features trustworthy. That's an engineering problem, and you'll solve it by building: the systems those features run on, the eval harnesses that tell us whether they're actually good, and the monitoring that catches quality problems and drift in production.

That work rests on a data foundation you'll also own. Tern is the system of record for every advisor and agency on the platform, so our data is one of the most valuable things we have, and it has to do double duty: feeding the AI features that are becoming core to the product, and powering the reporting that advisors and agency owners rely on to run their businesses. You'll own that data end to end, from the pipelines that move it through to the reporting built on top of it, so the AI work always has solid ground to stand on.

Requirements

Production AI/ML experience (the must-have): You've built and shipped AI and/or ML systems that real users depended on in production, and you owned what happened after launch: watching quality, debugging bad outputs, and making the system better over time. This matters more to us than any specific tool or title.
Evals as engineering: You treat evaluation as something you build, not a report you write. You have a real point of view on what to measure, how to catch drift and regressions, and when a metric is lying to you, ideally from having built evals for a production system.
Data pipeline and service ownership: You've personally built and owned pipelines and services that move data from application sources into a warehouse. You know what breaks, when and why, and you own the fix.
High agency: You've taken ambiguous, under-specified problems and driven them to a working outcome. You don't need a fully-scoped ticket to start.

Nice To Haves

Experience with Ruby on Rails, or with working directly from an application database rather than only from downstream data
Hands-on experience with LLM evaluation and observability tooling
Experience with MCP-based tooling or agentic data workflows

Responsibilities

Build and ship the systems behind Tern's AI features: the data infrastructure, services, and agentic tooling they run on. The output of this role is working software in production, not decks or recommendations.
Build the evaluation systems that answer whether an AI feature is good enough to ship and good enough to keep: eval harnesses, datasets, and production monitoring that run as software, not one-off analyses. Where no quality bar exists yet, build the thing that sets it.
Own data quality as a first-class concern across ingestion, modeling, and reporting. Catch problems before they reach a model, a dashboard, or a user. Fix them end to end.
Build and maintain the ETL systems that move and shape data from our application and third-party sources. Keep them reliable as volume grows.
Work alongside product squads to build the reporting that gives advisors and agency owners real visibility into how their business is performing.
Make the people around you faster and better. Share context early and write clearly so others can build on your work.