Data Engineer

Waystation AI
Redwood, CA
Onsite

About The Position

The owner of the data layer the entire product is built on, from raw supplier email to structured system of record.

Waystation is building the operating system for procurement in consumer packaged goods (CPG). Today, ingredient and packaging sourcing still runs through inboxes, PDFs, and spreadsheets. It's slow, opaque, and costly. Waystation replaces that chaos with an AI-powered procurement platform that creates structure, visibility, and leverage without forcing suppliers into portals. The result: real ROI. One customer saved over $200,000 in the first three months, paying for their annual contract in the first 30 days.

Waystation is led by repeat founder Ryan Caldbeck (previously founded CircleUp) and backed by Founder Collective, Homebrew, Slow Ventures, 87 Capital, Floodgate, and SuccessVP. We have paying customers, real usage, and a product that works.

Structured data isn't a feature of our product; it is the product. We take the messiest input imaginable (hundreds of thousands of disconnected supplier emails and PDFs: specs, COAs, pricing, certs) and turn it into a clean, queryable system of record shared across procurement, QA, and R&D.

You own that layer end to end. The extraction pipeline, the data model, the infrastructure the rest of engineering builds on: it's yours, not a slice of it. The quality of what every user sees, what every model trains on, and what every customer ROI claim rests on flows through what you build.

No one will hold your hand. You'll have unusual access and unusual scope, and you'll be expected to use both. You'll move fast and ship scrappy: a rough system working today beats a perfect one next quarter. We don't have the resources to gold-plate, and neither do you.

Requirements

  • Have built in the chaos — required. You've done real work at an early-stage startup (seed or Series A), where there was no playbook, no infrastructure handed to you, and never enough hours. You know the difference between building from zero and maintaining someone else's system. A purely big-company background isn't a fit for this seat.
  • Move fast and stay scrappy. You ship, learn, and iterate in the open rather than polishing in private. Constraints — fewer people, less tooling, no time — energize you instead of stalling you. You find the version that works now and earn the polish later.
  • Have one superpower. There's a thing you're genuinely better at than almost anyone (data systems, extraction, ML pipelines), and you can name it and point to results that prove it. We want a sharp edge and the slope to outgrow the job, not someone evenly good at everything.
  • Have real depth. 8+ years building production data systems.
  • Are deep in Python, SQL, and modern data tooling. You can architect a system as easily as you can ship a fix, and you do both at startup speed.
  • Own whole problems. You take messy things start to finish and close them without being asked. When the data is wrong, you fix the system, not the symptom.
  • Build leverage. You reach for tools, automation, and agents to scale yourself instead of grinding manually. We live in Claude Code — you should want to, too.
  • Are all in. This is a rocket ship you want to plant a flag on and ride through the messy middle — not a stepping stone. We're betting on you; we need you betting on us.
  • Have grit. You've ground at something hard for a long time, through the part where it stopped being fun and the feedback loop ran far longer than your next review. You don't flinch when the work gets ugly.

Nice To Haves

  • Document extraction, NLP, or ML pipelines (a strong plus).
  • Bonus: regulated document-heavy domains; CPG, supply chain, or procurement; multi-language data (Chinese, Spanish).

Responsibilities

  • Own the extraction pipeline. Turn messy supplier emails and documents — specs, COAs, pricing, certs, multi-language, bad scans — into structured, validated data.
  • Push accuracy and prove it. Drive extraction past today's 85%+ and build the eval harness that measures it, per document type, so the number is real and not a vibe.
  • Own the data model. Unify suppliers, documents, RFPs, pricing, and certifications into one source of truth — and build for institutional memory, so every email compounds into leverage.
  • Build infrastructure others depend on. Ship reliable, observable pipelines and own data quality, lineage, and the monitoring that catches problems before customers do.
  • Treat extraction as an ML problem. Eval sets, regression testing, accuracy tracking over time — turn customer-reported errors into systematic improvements, not one-off patches.
  • Build leverage. Reach for models and agents first. Automate the long tail instead of grinding it.

Benefits

  • Competitive base salary + meaningful equity — real ownership, with upside tied to the outcomes you drive
  • Ownership of the data layer the entire product is built on, working directly with a repeat founder & CEO — a front-row seat to how an AI-native company gets built
  • A real product with real ROI — value you can measure
  • Full health, dental, and vision coverage
  • Unlimited vacation — we care about outcomes, not hours
  • An in-person team that values craft and ambition