Agentic AI Engineer

Benchling•San Francisco, CA

3d•$176,000 - $265,000•Hybrid

About The Position

Benchling is rebuilding biotech for the AI era, aiming to compress decades of R&D work into years by integrating AI and clean, structured scientific data into scientific processes. Benchling provides an AI platform for biotech R&D, used by over 200,000 scientists globally. The company is building an AI scientist for its customers and emphasizes AI fluency as a core competency. As part of the interview process, candidates will complete an AI-focused exercise or discussion.

The role is within the Intelligence Engineering & Enablement team, which is responsible for internal AI tooling, adoption, AI-assisted workflows, cross-functional agentic AI applications, and the underlying data infrastructure. The team acts as an enabler, setting standards and shared infrastructure for departmental teams and AI power users, while also managing enterprise-grade agentic systems. The founding engineer for this team will own the technical direction, architecture, and delivery of the agentic AI portfolio, acting as a player-coach who stays hands-on with code and partners with product management and data teams. This is a senior individual contributor role focused on technical leadership, ideation, planning, delivery, and driving technical hiring; people management is handled by the hiring manager.

Requirements

7+ years of professional software engineering experience building production systems, with strong systems design fundamentals.
Hands-on experience building production systems that integrate with LLMs and/or agentic patterns: orchestration, tool use, memory and state management, evaluation, and observability.
Demonstrated understanding of how to optimize workloads across deterministic and non-deterministic capabilities, striking the right architectural balance for the needs of the specific solution being implemented.
Production experience with at least two of: Python, TypeScript/Node.js, Go; comfortable working across the stack.
Hands-on expertise with LLM APIs (OpenAI, Anthropic), agentic frameworks (LangChain, CrewAI), RAG over business content (Confluence, contracts, policies), vector databases (pgvector, Pinecone), workflow automation (n8n, Langflow), and LLM observability and evaluation tooling (LangSmith, Arize).
Track record of going from zero to one: a platform, function, or product area you built up from scratch and scaled.
Experience operating in regulated or security-sensitive environments. Solid grasp of enterprise security fundamentals — encryption, access controls, audit logging, secrets management.
Comfortable exercising technical leadership independent of positional authority. You set direction, raise the bar in design reviews, and grow other engineers through influence.
You build software with a product-first approach. You ship code quickly and care about the real-world impact of your work.
You enjoy ownership and building key pieces of platforms.
Strong communication skills with both technical and non-technical audiences. You can translate department workflows into engineering plans, and engineering tradeoffs into business language.
Interest in learning more about life science (prior knowledge is not required).

Nice To Haves

Background in enterprise SaaS, life sciences, or biotech.
Familiarity with LLM orchestration patterns and frameworks (LangGraph, MCP, agent SDKs from major model providers).
Experience with async orchestration (Temporal, Prefect, Airflow) applied to long-running or agentic workflows.
Familiarity with SOC 2, HIPAA, or GxP compliance as they apply to AI systems.
Experience building internal developer platforms or internal tools at scale.
Direct experience coaching or enabling non-engineers (analysts, ops staff, business power users) to build with AI tooling.

Responsibilities

Define the foundational architecture for enterprise agentic AI at Benchling — orchestration, agent frameworks, tool integrations (including MCP), memory and state management, evaluation, and observability. Make clear build vs. buy decisions across the stack with documented rationale.
Spend at least half your time writing production code, particularly during the team's first year. Stand up the CI/CD, testing, evaluation, and deployment infrastructure for agentic systems, leveraging existing patterns from Benchling's Build organization wherever possible. Graduate prototypes from the AI Product Manager's discovery cycles into hardened, production-grade systems and own production support under a "you build it, you run it" model.
Build for multi-tenant isolation, secrets management, audit logging, payload encryption, role-based access controls, and human-in-the-loop controls calibrated to risk. Partner with Security Engineering on threat modeling for agentic architectures — prompt injection, tool misuse, data exfiltration vectors.
Coach power users and departmental teams on production patterns, develop the criteria that decide which prototypes graduate into enterprise-grade systems, and build the internal-facing developer experience — templates, SDKs, sandboxes — that lets builders outside this team ship safely.
Work closely with our Data, Analytics & Systems team peers on the source-of-truth datasets and pipelines that agentic systems depend on. Engage with department leaders on the workflows we're transforming, and with Benchling's platform and infrastructure teams to leverage existing capabilities rather than build parallel systems.
Set the bar for code quality, testing and evaluation, documentation, and on-call practices. Drive technical hiring through interview loop design, bar-raising in interviews, and representing the team to senior candidates. Mentor engineers on the team and other AI builders across the company.