Senior AI Agent Engineer Job

Planera

2d • Remote

About The Position

Join Planera to build Manny, our AI scheduling assistant, and shape how construction schedulers work with AI on a modern Critical Path Method platform. You will own agent features end to end: designing and evolving the LangGraph/LangChain agent, engineering prompts and tools, integrating LLMs across providers, and holding response quality to a high bar with a real evaluation and observability stack. This is a hands-on applied AI role with a strong software engineering foundation and a focus on reliability, behavior quality, and user impact. You will work directly with the CTO and the lead AI engineer.

Requirements

4+ years of software engineering experience, including recent hands-on work building production LLM features
Strong proficiency in Python for building production services
Hands-on experience building agentic systems with LLMs: tool and function calling, ReAct or similar loops, and orchestration frameworks such as LangChain/LangGraph
Practical prompt engineering skill: shaping model behavior reliably, debugging failures from traces, and managing large prompts and token cost
Experience evaluating LLM systems: building datasets, writing evaluators, catching regressions, and using tracing and observability tooling
Experience with the Model Context Protocol (MCP) or building tool and function-calling integrations for LLMs
Solid understanding of API design (REST, WebSockets, SSE, and streaming) and inter-service communication
Product mindset with a focus on user impact and pragmatic tradeoffs
Excellent remote communication skills

Nice To Haves

Experience with MongoDB and Redis
Cloud experience (AWS or GCP), containers, and CI/CD
Go experience, as most of our backend systems are written in Go, including the MCP tool server
Practical experience with retrieval-augmented generation (RAG), embeddings, and vector stores
Familiarity with LangSmith or comparable LLM evaluation and tracing platforms
Frontend or React familiarity for agent UI work
Domain knowledge in construction tech, project management, or scheduling

Responsibilities

Design, build, and own Manny features end to end across the agent backend, tools, and UI
Improve agent behavior, reliability, and answer quality through prompt engineering, tool design, and changes to the agent control flow
Evolve the agent architecture: ReAct loop, routing and controller logic, multi-node graphs, tool selection, and streaming responses
Integrate and tune LLMs across providers (Anthropic, OpenAI, Google), balancing quality, latency, and cost, including prompt caching and model selection
Design and extend Manny's tool surface through the MCP server that connects the agent to Planera's scheduling services
Build and own the evaluation loop: golden datasets, automated evaluators, snapshot-based replay, and offline and online quality metrics
Implement observability for agent runs with tracing, metrics, and structured logging, and use it to debug and improve behavior in production
Ensure safe, sandboxed execution of model-generated code and safe handling of tool side effects and mutations
Collaborate with product, backend, and frontend to deliver AI features end to end