Technical Product Manager - AI/ML

Search Atlas•San Francisco, CA

$180,000 - $250,000•Hybrid

About The Position

Build Agents. Ship Intelligence. Define the Category.

Location: San Francisco, CA (Hybrid)

The Mission

Search Atlas hit $32M ARR bootstrapped. No VC. No safety net. Just product that works.

Now we're building agentic marketing - AI systems that don't report data, they execute strategies autonomously. Millions of pages crawled. Decisions made in milliseconds. Fortune 500s trusting our agents with their growth.

We need a Technical PM who prototypes faster than most teams ship. You don't write tickets about AI. You build agents in Cursor, validate with SQL, and deploy to production. This is zero-to-one product craft at scale.

San Francisco. In-person energy. Ship-or-die velocity.

What Winning Looks Like

Week 2: You've shipped a prototype agent in Claude Code that fixes technical SEO issues autonomously. It works. It's rough. It's real.
Month 3: Your agent is in production serving enterprise customers. You've defined the evaluation framework, the latency budget, the failure modes. Engineers trust your specs because you've validated the approach yourself.
Month 6: You're architecting the next generation of autonomous marketing systems. Other PMs study your playbooks.

Your Playground

You'll own one core agentic system end-to-end:
OTTO - Our autonomous SEO agent. Crawls sites. Diagnoses issues. Executes fixes. No humans in the loop.
Content Intelligence - Semantic engines that generate, optimize, and publish content autonomously.
Brand Knowledge Graphs - AI systems that build, maintain, and leverage entity relationships at scale.

The Work

Architect Agent Behavior
Design reasoning chains, tool use patterns, and reflection mechanisms. Your agents don't just respond - they think, act, verify.
Write specs that include prompt architectures, evaluation datasets, and edge case handling. Engineers review your PRDs for technical depth.
Build working prototypes in Cursor, Claude Code, or raw Python to prove viability before engineering investment.

Master the Data Layer
Query terabyte-scale datasets in ClickHouse and PostgreSQL. Window functions, complex joins, query optimization - you don't delegate this.
Use Python (pandas, SQLAlchemy) to analyze agent performance, identify failure patterns, and propose improvements.
Design evaluation pipelines: hallucination detection, citation verification, confidence scoring, human-in-the-loop triggers.

Craft Dense Interfaces
Figma prototypes that handle millions of data points without overwhelming users. Maximum insight per pixel.
Real-time streaming updates. Interactive agent explanations. Interfaces that make complexity feel inevitable.
Work shoulder-to-shoulder with engineers on React/TypeScript implementations. You don't hand off designs; you co-build.

Drive 10x Velocity
Lead standups that unblock, not update. Sprint planning that commits aggressively and delivers completely.
QA in staging with engineering rigor. Test edge cases. Validate reasoning chains. No "throw it over the wall."
Maintain 48-hour prototype-to-feedback cycles. Ship weekly, learn daily.

Requirements

You've shipped AI products that users trust - not demos, production systems.
5+ years in technical product management. 2+ years building AI/ML systems in production environments.
SQL fluency at scale. You write complex queries, optimize performance, debug data pipelines personally.
Python proficiency for analysis, scripting, and prototyping. You read code, write scripts, and validate technical approaches without engineering handholding.
AI/ML depth - you understand ReAct, tool use, reflection, chain-of-thought. You've implemented these patterns, not just read about them.
Prototyping velocity - Cursor, Claude Code, Jupyter, Streamlit. You build to learn in hours, not weeks.
Design craft - Figma at high fidelity. You care about interaction details, information density, flow states.
Engineering fluency - Git workflows, API design, basic React/TypeScript. You review PRs, not just PRDs.

Nice To Haves

Built autonomous agents (not chatbots).
SEO/MarTech domain expertise.
Frontend or data engineering background.
Vector DB/RAG at scale.

Responsibilities

Architect Agent Behavior
Design reasoning chains, tool use patterns, and reflection mechanisms.
Write specs that include prompt architectures, evaluation datasets, and edge case handling.
Build working prototypes in Cursor, Claude Code, or raw Python to prove viability before engineering investment.
Master the Data Layer
Query terabyte-scale datasets in ClickHouse and PostgreSQL.
Use Python (pandas, SQLAlchemy) to analyze agent performance, identify failure patterns, and propose improvements.
Design evaluation pipelines: hallucination detection, citation verification, confidence scoring, human-in-the-loop triggers.
Craft Dense Interfaces
Figma prototypes that handle millions of data points without overwhelming users.
Real-time streaming updates.
Interactive agent explanations.
Work shoulder-to-shoulder with engineers on React/TypeScript implementations.
Drive 10x Velocity
Lead standups that unblock, not update.
Sprint planning that commits aggressively and delivers completely.
QA in staging with engineering rigor.
Test edge cases.
Validate reasoning chains.
Maintain 48-hour prototype-to-feedback cycles.
Ship weekly, learn daily.