Introducing fishmem: open-source memory for AI agents

AI agents forget everything the moment a session ends. Today we're open-sourcing fishmem — a memory engine that extracts durable facts from conversations and recalls them with hybrid graph + vector search. Free to self-host, with a hosted cloud for when you'd rather not run it yourself.

Your agent forgets everything

Large language models are stateless. Ask one a question today and again tomorrow and you get the same generic answer — it has no idea who you are, what you told it last time, or what it worked out about your codebase an hour ago. The moment a session ends, the context is gone.

Teams paper over this by stuffing whole conversation histories back into the prompt, or by reaching for RAG. But RAG is a general-purpose search tool over static documents, and memory is not static. Facts change. People move cities, switch stacks, change their minds. A store that doesn't track when something was true will confidently hand your agent a stale fact.

Memory deserves its own primitive. So we built one.

Introducing fishmem

fishmem is an open-source memory engine for AI agents. Give it a conversation and it extracts the durable facts worth keeping, stores them on a hybrid graph + vector layer, and hands them back when they're relevant — across sessions, weeks, and millions of memories.

The same engine powers two ways to use it:

Self-host the framework. fishmem is open source. Run it on your own infrastructure, read every line, and own your data.
Use the hosted cloud. FishMem Cloud runs the same engine behind a mem0-compatible REST API, with a dashboard, API keys, usage metering, and billing — so you never operate a database, a vector index, or key management yourself.

It's open core: the cloud runs the code you can read on GitHub, so you're never locked into our hosting to keep your memory.

How it works

Writing a memory

When you add a conversation, fishmem doesn't just dump the transcript. It extracts atomic, durable facts with an LLM, then reconciles them against what it already knows — adding new facts, updating changed ones, and invalidating the ones they supersede. The result is a clean memory, not an append-only log.
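
As a rough illustration of that reconcile step (a sketch, not fishmem's actual implementation), the snippet below assumes the LLM has already extracted candidate facts. The `Fact` shape, the `reconcile` function, and the idea of matching facts by subject plus attribute are all hypothetical, chosen only to show the add / invalidate / no-op decision.

```typescript
// Hypothetical fact shape for illustration; not fishmem's real schema.
interface Fact {
  subject: string;      // e.g. "alex"
  attribute: string;    // e.g. "city"
  value: string;        // e.g. "Berlin"
  invalidated: boolean; // true once a newer fact supersedes this one
}

type Op =
  | { kind: "add"; fact: Fact }
  | { kind: "invalidate"; fact: Fact }
  | { kind: "noop"; fact: Fact };

// Reconcile freshly extracted facts against the existing store.
// A changed value invalidates the old fact and adds the new one,
// so the store is reconciled rather than blindly appended to.
function reconcile(store: Fact[], extracted: Omit<Fact, "invalidated">[]): Op[] {
  const ops: Op[] = [];
  for (const candidate of extracted) {
    const current = store.find(
      (f) =>
        !f.invalidated &&
        f.subject === candidate.subject &&
        f.attribute === candidate.attribute,
    );
    if (current && current.value === candidate.value) {
      ops.push({ kind: "noop", fact: current }); // already known, nothing to do
      continue;
    }
    if (current) {
      current.invalidated = true; // superseded, but kept for history
      ops.push({ kind: "invalidate", fact: current });
    }
    const fact: Fact = { ...candidate, invalidated: false };
    store.push(fact);
    ops.push({ kind: "add", fact });
  }
  return ops;
}

const store: Fact[] = [
  { subject: "alex", attribute: "city", value: "Lisbon", invalidated: false },
];
const ops = reconcile(store, [
  { subject: "alex", attribute: "city", value: "Berlin" },
  { subject: "alex", attribute: "theme", value: "dark" },
]);
console.log(ops.map((op) => `${op.kind}: ${op.fact.value}`));
```

In this toy version an update is modeled as an invalidate followed by an add, which keeps the superseded fact available for later inspection.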

Each fact lands in two stores at once: a graph store for entities and the relationships between them, and a vector index for semantic similarity.

Recalling a memory

Retrieval is hybrid. fishmem blends four signals — vector similarity, keyword match, temporal ordering, and graph diffusion (relevance spreading across connected entities) — so the right fact surfaces at the right moment, not just the most lexically similar one.
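
One simple way to picture blending several signals is a weighted sum over normalized scores. The sketch below is an assumption for illustration only: the post does not say how fishmem combines its signals, and the `Signals` shape, the weights, and the `blend` and `rank` helpers are all hypothetical.

```typescript
// Hypothetical per-candidate signals, each assumed to be normalized to [0, 1].
interface Signals {
  vector: number;   // semantic similarity to the query
  keyword: number;  // keyword match score
  temporal: number; // temporal ordering / recency
  graph: number;    // relevance diffused from connected entities
}

// Illustrative weights only; the real blend is not specified in this post.
const WEIGHTS: Signals = { vector: 0.4, keyword: 0.2, temporal: 0.2, graph: 0.2 };

// Weighted linear blend of the four signals.
function blend(s: Signals, w: Signals = WEIGHTS): number {
  return (
    s.vector * w.vector +
    s.keyword * w.keyword +
    s.temporal * w.temporal +
    s.graph * w.graph
  );
}

// Sort candidates by blended score, highest first, without mutating the input.
function rank<T extends { signals: Signals }>(candidates: T[]): T[] {
  return [...candidates].sort((a, b) => blend(b.signals) - blend(a.signals));
}

const ranked = rank([
  {
    text: "Alex lives in Lisbon",
    signals: { vector: 0.9, keyword: 0.8, temporal: 0.1, graph: 0.3 },
  },
  {
    text: "Alex lives in Berlin",
    signals: { vector: 0.85, keyword: 0.8, temporal: 0.9, graph: 0.6 },
  },
]);
console.log(ranked.map((c) => c.text));
```

With these made-up scores the Lisbon fact is slightly closer on vector similarity alone, but the temporal and graph signals lift the Berlin fact to the top.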

And every fact is bi-temporal: fishmem records both when a fact was true and when it learned it. When a user moves from Lisbon to Berlin, the old city is superseded, not deleted. The engine knows what's true now and what used to be true, so your agent stops answering with yesterday's facts.
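
To make the Lisbon-to-Berlin example concrete, here is a minimal bi-temporal sketch. It is not fishmem's schema: the `TemporalFact` fields, the `supersede` and `asOf` helpers, and the dates are all invented for illustration.

```typescript
// Hypothetical bi-temporal record; field names are illustrative.
interface TemporalFact {
  subject: string;
  attribute: string;
  value: string;
  validFrom: Date;      // when the fact became true in the world
  validTo: Date | null; // when it stopped being true (null = still true)
  recordedAt: Date;     // when the system learned it
}

// Supersede instead of delete: close the old fact's validity window.
function supersede(facts: TemporalFact[], next: TemporalFact): void {
  for (const f of facts) {
    if (
      f.subject === next.subject &&
      f.attribute === next.attribute &&
      f.validTo === null
    ) {
      f.validTo = next.validFrom;
    }
  }
  facts.push(next);
}

// Answer "what was true at this moment?" from the validity windows.
function asOf(
  facts: TemporalFact[],
  subject: string,
  attribute: string,
  at: Date,
): string | undefined {
  return facts.find(
    (f) =>
      f.subject === subject &&
      f.attribute === attribute &&
      f.validFrom.getTime() <= at.getTime() &&
      (f.validTo === null || at.getTime() < f.validTo.getTime()),
  )?.value;
}

const facts: TemporalFact[] = [];
supersede(facts, {
  subject: "alex", attribute: "city", value: "Lisbon",
  validFrom: new Date("2023-01-01"), validTo: null, recordedAt: new Date("2023-01-05"),
});
supersede(facts, {
  subject: "alex", attribute: "city", value: "Berlin",
  validFrom: new Date("2024-06-01"), validTo: null, recordedAt: new Date("2024-06-10"),
});

const cityNow = asOf(facts, "alex", "city", new Date("2025-01-01"));  // current answer
const cityThen = asOf(facts, "alex", "city", new Date("2023-09-01")); // historical answer
console.log({ cityNow, cityThen });
```

The `recordedAt` field is not queried here; keeping it is what would let a store also answer "what did the system believe on a given date?".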

A few lines to remember

import { FishMem } from "fishmem";

const memory = new FishMem();

// Remember something about a user
await memory.add(
  [{ role: "user", content: "I prefer dark mode and write TypeScript." }],
  { userId: "alex" },
);

// Recall it later, in a brand-new session
const results = await memory.search("what does alex like?", {
  userId: "alex",
});

On the hosted cloud it's the same idea over HTTP. The API is mem0-compatible, so migrating is a config change rather than a rewrite — in both directions:

curl https://fishmem.com/v1/memories \
  -H "Authorization: Bearer fm_..." \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"I moved to Berlin."}],"user_id":"alex"}'
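
For TypeScript callers, the same request can be assembled for `fetch`. This sketch only mirrors the curl call above (URL, headers, and JSON body); the `buildAddMemoryCall` helper and the `HttpCall` type are hypothetical, the API key is a placeholder, and the response shape is not covered here.

```typescript
// Plain description of an HTTP call, so it can be inspected before sending.
interface HttpCall {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper that mirrors the curl example: same URL, headers, and body.
function buildAddMemoryCall(apiKey: string, userId: string, content: string): HttpCall {
  return {
    url: "https://fishmem.com/v1/memories",
    init: {
      method: "POST", // curl's -d flag implies POST
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        messages: [{ role: "user", content }],
        user_id: userId,
      }),
    },
  };
}

const call = buildAddMemoryCall("YOUR_API_KEY", "alex", "I moved to Berlin.");
console.log(call.init.body);

// To actually send it (Node 18+ or a browser):
//   const response = await fetch(call.url, call.init);
```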

How well does it recall?

We benchmarked fishmem against mem0 OSS (TypeScript) on a shared harness: 233 held-out questions, a frozen config with no per-benchmark tuning, the same prompts, and the same LLM judge for both systems. The harness, prompts, and judge are open-sourced so you can rerun every number yourself.

Overall recall: 62.2% for fishmem vs 54.5% for mem0 (LLM-judge accuracy).
Temporal questions: 69.8% vs 28.3%, a 41.5-point gap and the category where bi-temporal tracking pays off most.
Multi-hop questions: 54.8% vs 57.1% — mem0 is ahead here, and we're publishing it anyway.
Search latency: p50 around 420 ms versus 2.4–3.8 s for a leading hosted alternative, measured from the same client, or roughly 6–9× faster on every turn of the agent loop.
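
The derived figures in that list follow directly from the quoted numbers. A quick check, using only values stated above (the variable names are ours):

```typescript
// Figures quoted in the list above.
const temporal = { fishmem: 69.8, mem0: 28.3 }; // percent, LLM-judge accuracy
const latencyMs = { fishmemP50: 420, otherLow: 2400, otherHigh: 3800 };

// Round to one decimal place to avoid floating-point noise.
const round1 = (x: number): number => Math.round(x * 10) / 10;

const temporalGap = round1(temporal.fishmem - temporal.mem0); // percentage points
const speedupLow = round1(latencyMs.otherLow / latencyMs.fishmemP50);
const speedupHigh = round1(latencyMs.otherHigh / latencyMs.fishmemP50);

console.log({ temporalGap, speedupLow, speedupHigh });
// → { temporalGap: 41.5, speedupLow: 5.7, speedupHigh: 9 }
```

The speedup works out to about 5.7× to 9.0×, which is the "roughly 6–9×" quoted above.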

Memory lives inside the agent loop, so retrieval latency compounds every turn — we treat it as a product feature, not an implementation detail. And we'd rather show you a benchmark we lose than one you can't reproduce.

Who it's for

Support agents that recall past tickets and resolutions across sessions.
Coding agents that persist decisions and conventions instead of re-learning the codebase.
Personal assistants that track changing preferences over months.
Sales copilots that connect contacts, objections, and commitments across calls.
Companions that stay coherent over long conversations without stuffing history into context.

Get started

fishmem is open source at github.com/fishmem-labs/fishmem. Star it, read it, run it.

If you'd rather not operate memory infrastructure, grab a hosted API key — the Hobby plan is free, no card required — and point your existing mem0 client at it. The docs have the full API.

Agents are getting good at reasoning. It's time they got good at remembering.