Artificial Intelligence Research Engineer

Artificial Intelligence Underwriting Company • San Francisco, CA

About The Position

As an AI Researcher at AIUC, you will develop and expand the evaluation methods that sit at the heart of our work. You will identify the most pressing problems in our evaluation stack, scope and lead projects to address them, and push the frontier of what rigorous AI assessment looks like. Your work spans three horizons. On the product side, you'll improve our scale and accuracy by building better LLM judges, tightening our pipelines, and making our evaluations faster and more reliable. On the pure research side, you'll deepen the quality of what we evaluate. This means designing new attack vectors, implementing techniques from the latest research, and building agentic automations that extend our capabilities. And on the longer horizon, you'll take on moonshot projects: things like fully dynamic attackers, self-expanding libraries of attacks, and novel approaches to evaluation that don't yet exist.

Requirements

Deep AI industry knowledge: You understand the techniques behind language model agents, LLM pipelines, and agentic systems. You know the tools, the players, and the direction the field is moving.
Strong technical foundation: Solid coding ability and statistical analysis skills. You're comfortable implementing research and building systems that others depend on.
Clear communication: You can collaborate effectively across engineering, product, and client-facing teams. You translate complex technical work into language that lands with any audience.
Research leadership: You've managed complex projects end-to-end and can coordinate across teams of researchers with clarity and conviction.
Culture fit: You are genuinely motivated by AIUC's mission. You thrive in a high-intensity environment and bring openness, honesty, and emotional maturity to the team.

Nice To Haves

Hands-on experience with AI red teaming, evals, post-training, or alignment research.
Prior experience in a fast-moving startup environment.
Published research or technical writing in AI safety, security, or related fields.
Best fit: A research background combined with product instincts and experience shipping real systems, not just writing papers about them.

Responsibilities

Identify and scope the highest-leverage problems in our evaluation system, then lead projects end-to-end to address them.
Build novel approaches to AI evaluation by implementing research papers, replicating attack techniques, and experimenting with new methods.
Lead and coordinate research teams, managing complex multi-person projects with clear ownership and delivery.
Communicate findings internally and externally through technical blog posts, papers, and direct engagement with client partners.
Feed insights back to the product, shaping our roadmap based on what you learn on the frontier.
