Building AI Agents with Hot Dev

Curtis Summers, Founder
Tags: announcement, ai-agents, hot-chat, hot-ai-agent, memory, rag, streaming

Today we're publishing Hot Chat — a polished web chat demo, runnable in 15 minutes, that's a blueprint for integrating AI agents into your own products with Hot Dev. It ships two AI agents side by side: one whose memory follows the person across devices, one whose memory follows the channel across teammates. Switch between them live in the toolbar. The UI is a Next.js + TypeScript app that talks to Hot through our new @hot-dev/sdk.

Alongside Hot Chat, we're publishing hot.dev/hot-ai-agent 1.0.0 — a new Hot package of reusable harness primitives for AI agents: transports, commands, runtime stores, rendering, streaming, and MCP helpers. It sits on top of hot-ai and turns Hot's low-level AI building blocks into the application-shaped surface you actually build a product on. Hot Chat uses every part of it.


Try It

Before running the demo, install Hot if you don't already have the hot CLI.

# Terminal 1: clone the demos and start both agents
git clone https://github.com/hot-dev/hot-demos
cd hot-demos/hot-chat
hot dev --open

# Terminal 2: configure and start the UI (run from hot-demos/hot-chat)
cp .env.example .env
# Hot App → Service Keys → New Key; paste it into HOT_API_KEY in .env.
npm install && npm run dev

Open http://localhost:3000. The toolbar switches between the two agents live, no restart. No LLM provider keys are required for the basic experience; the demo answers from local memory by default. The live LLM path uses Anthropic for the demo, but the harness sits on hot-ai and can be wired to other providers in your app.

The full walkthrough is at /docs/demos/hot-chat.


Two Agents, One Project

Hot Chat ships two agents in one Hot project. They look nearly identical on the surface — same chat UI, same slash commands, same streaming replies — but their memory works differently, which makes them useful for very different products.

Personal Mode is identity-first. Whatever you tell the agent follows you across sessions, tabs, and devices. Type /remember I prefer launch updates that start with blockers, close the tab, come back tomorrow on a different device, ask /recall — same notes. This is the shape for assistants, journaling apps, per-user copilots, anything where memory belongs to the person, not the conversation.

Team Mode is session-first. Memory is keyed to the channel, so two people chatting in the same room share one view, and two channels stay independent. Type "we decided to ship docs before launch", then "CI is the only blocker", then /ask what is blocking launch? — the reply cites the matching records with attribution. This is the shape for team chat bots, support inboxes, shared workspaces.

The toolbar switch routes the next message to the other agent. Both agents share the same chat turn, the same streaming protocol, and the same UI; the meaningful difference is exactly the one their names imply — how each one keys its memory.

Concept  | Team Mode             | Personal Mode
-------- | --------------------- | ----------------------------
Session  | the channel or thread | a scratch context per person
Identity | the person who posted | the durable memory owner
Memory   | scoped to the session | scoped to the user

Figure: Hot Chat, mid-conversation. The toolbar switches between Personal and Team mode live; here a reply from the Team Mode agent is streaming.
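
The difference between the two modes comes down to which id the memory is keyed by. The following TypeScript sketch illustrates that idea only; it is not how Hot Chat or hot-ai-agent stores memory, and every name in it (AgentMode, Scope, NoteStore, memoryKey) is hypothetical.

```typescript
// Hypothetical sketch: none of these names come from hot-ai-agent.
type AgentMode = "personal" | "team";

interface Scope {
  sessionId: string; // the channel, thread, or scratch context
  userId: string;    // the person who posted
}

interface Note {
  author: string;
  text: string;
}

// Personal Mode keys memory by user; Team Mode keys it by session.
function memoryKey(mode: AgentMode, scope: Scope): string {
  return mode === "personal"
    ? `user:${scope.userId}`
    : `session:${scope.sessionId}`;
}

class NoteStore {
  private notes = new Map<string, Note[]>();

  remember(mode: AgentMode, scope: Scope, text: string): void {
    const key = memoryKey(mode, scope);
    const existing = this.notes.get(key) ?? [];
    // Keep the author on every record so team replies can attribute it.
    this.notes.set(key, [...existing, { author: scope.userId, text }]);
  }

  recall(mode: AgentMode, scope: Scope): Note[] {
    return this.notes.get(memoryKey(mode, scope)) ?? [];
  }
}
```

With this keying, a note remembered in Personal Mode from one session is visible to the same user in any other session, while a note remembered in Team Mode is visible to every user in that session and to nobody outside it.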

What You Actually See

The Hot Chat UI is intentionally generic. It looks like a chat product, not a Hot demo. That's because the experience is the point:

  • Quick-prompt chips — the first time you load each mode, you get a column of suggested actions: Recall preferences, Daily brief, Decisions, Ask the team. Each chip sends a slash command immediately, so you can poke at the agent without learning any syntax.
  • Streaming replies — every assistant message renders as the agent generates it. Slash-command replies stream too, identically to LLM responses, so the UI doesn't have to know which is which.
  • File attachments — drag a small notes.md or screenshot onto the chat. A chip appears below the composer; the next reply notes that an attachment was carried. The agent stores the file's name and type as metadata and could be extended to parse contents.
  • Identity panel — click Identity in the toolbar to see the exact session_id and user_id the agent receives, in the same format a Slack or Telegram adapter would generate. Edit your display name; the next message picks it up.
  • Agent Graph — open the Hot Dev App and click into either agent's Graph tab. Every slash command is its own typed event handler, so you can see the shape of the agent without digging through a central dispatch function.

Figure: The Hot Dev Agent Graph view for the PersonalAgent demo. One event handler per command, no central dispatch; each slash command is its own node connected to the streaming reply outputs.
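
Because slash-command replies and LLM replies stream the same way, the client can fold both through one reducer. The sketch below shows one way a UI might do that in TypeScript. Only the three label suffixes (reply:start, reply:delta, reply:end) come from this post; the payload fields (replyId, text), the "personal-agent" label prefix, and the function name are assumptions made for illustration, not the @hot-dev/sdk API.

```typescript
// Assumed payload shape; only the three event labels come from the post.
interface ReplyEvent {
  label: string;   // e.g. "personal-agent:reply:delta" (prefix is assumed)
  replyId: string; // hypothetical id tying start/delta/end together
  text?: string;   // hypothetical: the chunk carried by a delta event
}

interface ReplyState {
  text: string;
  done: boolean;
}

type Replies = Record<string, ReplyState>;

// Fold one event into the map of in-flight replies, returning a new map.
function applyReplyEvent(replies: Replies, ev: ReplyEvent): Replies {
  // Keep the last two label segments, e.g. "reply:delta".
  const phase = ev.label.split(":").slice(-2).join(":");
  const current = replies[ev.replyId] ?? { text: "", done: false };
  switch (phase) {
    case "reply:start":
      return { ...replies, [ev.replyId]: { text: "", done: false } };
    case "reply:delta":
      return {
        ...replies,
        [ev.replyId]: { ...current, text: current.text + (ev.text ?? "") },
      };
    case "reply:end":
      return { ...replies, [ev.replyId]: { ...current, done: true } };
    default:
      return replies; // not a reply event; leave state untouched
  }
}
```

The reducer never inspects who produced the reply, so a slash-command handler and an LLM call render through the same code path.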

What hot-ai-agent Brings

If you've built an AI chat agent before, you've probably written some version of this stack: a slash-command parser, a way to thread the LLM call through retrieval-augmented memory, a streaming reply mechanism, per-agent stores for state and stats, and per-request session/identity bindings so tools know who's talking. Every chat agent reinvents these.

hot-ai-agent extracts that layer. Concretely, it gives you:

  • Typed transport messages — a single IncomingMessage shape that adapters (web, Slack, Telegram, anything) translate into. The agent never branches on transport.
  • Slash-command parsing: /ask@MyBot what's up? (where @MyBot is the Telegram-style bot suffix that disambiguates which bot in a group is being addressed) becomes {name: "ask", arg: "what's up?"}, with the @MyBot stripped. No dispatch policy is baked in.
  • The memory-grounded chat turn — the canonical "recall → persist user → bind request → stream → persist assistant" lifecycle in one function call. This one matters: getting the order wrong causes a subtle bug where the user's own fresh message contaminates their own retrieval (we call it RAG self-poisoning). The helper enforces the right order.
  • Stable streaming events — every agent emits <agent>:reply:start / :delta / :end events at a stable, agent-scoped label so the client doesn't have to know which LLM provider is underneath.
  • Per-request session binding — when an LLM tool runs mid-turn, the resolved session and identity are bound to the current agent request. The model can't be tricked into writing to another user's memory by passing fake ids in the tool call.
  • Per-agent stores and a session registry — every agent gets state, stats, errors, and a notification ledger. Scheduled jobs (daily digests, weekly summaries) iterate registered sessions and fan out per-session, with per-session error isolation.
  • MCP plumbing — expose any of the agent's functions as an MCP tool with one annotation; Claude Desktop, Cursor, and other MCP clients can call into the agent directly.
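
To make the slash-command item above concrete, here is a minimal TypeScript sketch of a parser with the same input/output behavior as the /ask@MyBot example. It is an illustration, not the hot-ai-agent implementation: the function name, the regular expression, and the choice to return null for plain chat text are all assumptions.

```typescript
interface ParsedCommand {
  name: string;
  arg: string;
}

// Returns null for plain chat text so the caller decides how to route it.
// Assumed grammar: "/" + command name, optional "@BotName", optional argument.
function parseSlashCommand(text: string): ParsedCommand | null {
  const pattern = /^\/([A-Za-z0-9_-]+)(?:@[A-Za-z0-9_]+)?(?:\s+([\s\S]*))?$/;
  const match = pattern.exec(text.trim());
  if (match === null) {
    return null;
  }
  // match[1] is the command name; the "@BotName" suffix is not captured,
  // so it is dropped. match[2] is the argument, if any.
  return { name: match[1], arg: (match[2] ?? "").trim() };
}
```

Under these assumptions, parseSlashCommand("/ask@MyBot what's up?") yields { name: "ask", arg: "what's up?" }, and a plain message such as "CI is the only blocker" yields null. The parser only describes the command; deciding which handler runs is left to the caller, in line with the "no dispatch policy" point above.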

What it deliberately doesn't include: transport vendor packages. No Slack, no Telegram, no Discord. Those live in the application and translate to the neutral types, which keeps the harness portable and the dependency tree small.
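
As a rough picture of what "translate to the neutral types" means, the TypeScript sketch below maps two made-up transport payloads onto one neutral message shape. The IncomingMessage fields shown here, both payload types, and all function names are hypothetical; the real types in hot-ai-agent may look different.

```typescript
// Hypothetical neutral shape; the real IncomingMessage fields may differ.
interface IncomingMessage {
  sessionId: string;
  userId: string;
  displayName: string;
  text: string;
}

// Hypothetical, simplified payloads as two transports might deliver them.
interface WebPayload {
  room: string;
  user: { id: string; name: string };
  body: string;
}

interface ChatAppPayload {
  chat_id: number;
  from_id: number;
  from_name: string;
  message: string;
}

// Each adapter lives in the application and owns its transport's quirks.
function fromWeb(p: WebPayload): IncomingMessage {
  return {
    sessionId: `web:${p.room}`,
    userId: `web:${p.user.id}`,
    displayName: p.user.name,
    text: p.body,
  };
}

function fromChatApp(p: ChatAppPayload): IncomingMessage {
  return {
    sessionId: `chatapp:${p.chat_id}`,
    userId: `chatapp:${p.from_id}`,
    displayName: p.from_name,
    text: p.message,
  };
}

// Agent-side code sees only the neutral type and never branches on transport.
function formatForLog(msg: IncomingMessage): string {
  return `${msg.displayName} in ${msg.sessionId}: ${msg.text}`;
}
```

Adding another transport under this design means writing one more adapter function; nothing on the agent side changes.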


Under the Hood, in One Snippet

When all the harness pieces are in place, an entire chat-style event handler in Hot looks like this:

remember-message
meta {
    agent: PersonalAgent,
    on-event: "personal-agent:remember",
}
fn (event) {
    d        event.data
    sender   identity-from-data(d)
    session  session-from-data(d, sender)
    input    base-input(d, session, sender, Str(or(d.text, "")))
    ::chat-turn/run-chat-turn(turn-cfg, input)
}

That's the whole handler. Resolve who's talking, package up the input, hand off to run-chat-turn. RAG, persistence, ordering, streaming, request binding, tool dispatch, error handling — all behind that one call.
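
For readers who want a feel for what sits behind that call, the TypeScript sketch below walks through the "recall → persist user → bind request → stream → persist assistant" ordering described earlier. It is a simplified, synchronous illustration with hypothetical names (Rec, TurnInput, TurnDeps, runTurn), not the run-chat-turn implementation; in particular, "binding" is reduced here to freezing the resolved ids.

```typescript
// Hypothetical sketch of the turn ordering; not the hot-ai-agent code.
interface Rec {
  role: "user" | "assistant";
  text: string;
}

interface TurnInput {
  sessionId: string;
  userId: string;
  text: string;
}

interface TurnDeps {
  recall(input: TurnInput): Rec[];                      // retrieval over stored memory
  persist(input: TurnInput, record: Rec): void;         // write one record to memory
  generate(input: TurnInput, context: Rec[]): string[]; // stand-in for a streaming model
  emit(phase: "start" | "delta" | "end", text: string): void;
}

function runTurn(deps: TurnDeps, input: TurnInput): string {
  // 1. Recall first, so the fresh message cannot show up in its own retrieval.
  const context = deps.recall(input);

  // 2. Persist the user's message only after retrieval has run.
  deps.persist(input, { role: "user", text: input.text });

  // 3. Bind the request: freeze the resolved ids used for the rest of the turn.
  const bound: TurnInput = Object.freeze({ ...input });

  // 4. Stream the reply chunk by chunk.
  deps.emit("start", "");
  let reply = "";
  for (const chunk of deps.generate(bound, context)) {
    reply += chunk;
    deps.emit("delta", chunk);
  }
  deps.emit("end", reply);

  // 5. Persist the assistant message last.
  deps.persist(bound, { role: "assistant", text: reply });
  return reply;
}
```

Swapping steps 1 and 2 is the self-poisoning bug: retrieval would then return the message the user just sent as if it were prior memory.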

Adding a new slash command in your own agent is the same shape: one more function, one more on-event annotation.


Get Started


Building on Hot? Tell us about it on X @hotdotdev — we love seeing what people make.
