Document Upserting: Keeping Evolving Conversations Fresh
TL;DR: Memory systems that only append create duplicates when information changes. Hindsight’s document_id parameter enables upsert semantics - same ID updates existing memories, new ID creates new ones. This keeps your memory bank consistent without manual deduplication.
The Append-Only Problem
Most memory systems work like append-only logs. Every retain() call adds new memories. Nothing ever updates.
This breaks down fast in real conversations:
Day 1: User says "I work at Acme Corp"
Day 5: User says "I just started at NewStartup"

With append-only, you now have two conflicting facts. Query "Where does the user work?" and you might get either answer - or worse, both. The system has no way to know that the second statement supersedes the first.
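To see the failure mode concretely, here is a minimal sketch using the same client shown later in this post. Both calls omit document_id, so each one simply appends:

from hindsight_client import Hindsight

with Hindsight(base_url="http://localhost:8888") as client:
    # Day 1 - appends facts about Acme Corp
    client.retain(
        bank_id="user-profile",
        messages=[{"role": "user", "content": "I work at Acme Corp"}]
    )

    # Day 5 - appends facts about NewStartup; nothing is replaced
    client.retain(
        bank_id="user-profile",
        messages=[{"role": "user", "content": "I just started at NewStartup"}]
    )

    # Both sets of facts now live in the bank, so this query
    # may surface Acme Corp, NewStartup, or both
    results = client.recall(bank_id="user-profile", query="Where does the user work?")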
In my experience, this is one of the first issues that surfaces when you deploy an agent with long-term memory. Information changes. People update their preferences. Projects evolve. Facts become outdated.
Document ID as Upsert Key
Hindsight solves this with document_id - an optional identifier you pass to retain(). The behavior is simple:
- Same document_id: Completely replaces all memories from the previous call
- New document_id: Creates new memories (normal append behavior)
- No document_id: Auto-generates one, so every call effectively appends
from hindsight_client import Hindsight

with Hindsight(base_url="http://localhost:8888") as client:
    # First retention - creates memories
    client.retain(
        bank_id="user-profile",
        document_id="employment-info",
        messages=[
            {"role": "user", "content": "I work at Acme Corp as a senior engineer"}
        ]
    )

    # Same document_id - replaces previous memories
    client.retain(
        bank_id="user-profile",
        document_id="employment-info",
        messages=[
            {"role": "user", "content": "I just joined NewStartup as CTO"}
        ]
    )

    # Query returns only the updated info
    results = client.recall(
        bank_id="user-profile",
        query="Where does the user work?"
    )
    # Returns: NewStartup as CTO (not Acme Corp)

The old memories about Acme Corp are gone. No duplicates. No conflicts.
Why Full Replacement?
You might wonder why Hindsight replaces all memories under a document_id rather than merging them. The reasoning comes down to fact extraction.
When you call retain(), Hindsight doesn’t store your raw messages. It extracts atomic facts, entities, and relationships. A single conversation might generate dozens of interconnected memories.
Attempting to merge new facts with old ones creates several problems:
Conflict resolution is ambiguous: If old facts say “works at Acme” and new facts say “works at NewStartup”, which wins? You need full context to decide.
Temporal consistency breaks: Facts extracted together have temporal relationships. Mixing facts from different extraction passes loses this coherence.
Graph integrity suffers: The knowledge graph links facts to each other. Partial updates can leave dangling edges or contradictory paths.
Full replacement is deterministic. Same input, same output. You always know exactly what memories exist under a document_id.
Use Cases
Evolving Conversation Sessions
The most common use case: updating memories as a conversation progresses.
# Use conversation/session ID as document_id
session_id = "conv-abc-123"

# After each user message, re-retain the full conversation
client.retain(
    bank_id="agent-memory",
    document_id=session_id,
    messages=conversation_history  # Full history, not just latest
)

Each retention replaces the previous extraction. As the conversation evolves, so does the memory - without accumulating outdated interpretations.
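Dropped into an agent loop, the pattern looks roughly like this. This is a minimal sketch that assumes a hypothetical generate_reply() helper for the LLM call and reuses the client from the earlier examples:

conversation_history = []
session_id = "conv-abc-123"

def handle_user_message(user_text: str) -> str:
    # Append the new turn to the running history
    conversation_history.append({"role": "user", "content": user_text})

    reply = generate_reply(conversation_history)  # hypothetical LLM call
    conversation_history.append({"role": "assistant", "content": reply})

    # Re-retain the full history under the same document_id;
    # this replaces the previous extraction for this session
    client.retain(
        bank_id="agent-memory",
        document_id=session_id,
        messages=conversation_history
    )
    return reply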
Document Sync
If you’re ingesting external documents that update periodically (wikis, docs, specs), use the document path or ID:
def sync_document(doc_path: str, content: str):
    client.retain(
        bank_id="knowledge-base",
        document_id=f"doc:{doc_path}",
        messages=[{"role": "system", "content": content}]
    )

Re-running sync with updated content replaces old facts. No need to track what changed.
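For example, a scheduled job could walk a docs directory and call sync_document for each file, letting the relative path double as the upsert key. A sketch - the directory layout and file extension here are assumptions:

from pathlib import Path

def sync_all_docs(docs_root: str = "./docs"):
    for path in Path(docs_root).rglob("*.md"):
        relative = path.relative_to(docs_root)
        # Same path -> same document_id -> old facts replaced on every run
        sync_document(str(relative), path.read_text())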
User Profile Updates
User preferences and profile data change over time:
def update_user_profile(user_id: str, profile_text: str):
    client.retain(
        bank_id="user-profiles",
        document_id=f"profile:{user_id}",
        messages=[{"role": "user", "content": profile_text}]
    )

Each profile update keeps memories current without manual cleanup.
What About Historical Accuracy?
A reasonable concern: if memories get replaced, don’t you lose history?
It depends on what you need. Document upserting is for current state - what’s true now. If you need change history - what was true before - use different document_ids:
from datetime import datetime

# Versioned approach for audit trails
version = datetime.now().isoformat()
client.retain(
    bank_id="audit-log",
    document_id=f"profile:{user_id}:v{version}",
    messages=[...]
)

Or keep the latest in one document and append history to another:
# Current state (upsert)
client.retain(
    bank_id="user-data",
    document_id=f"current:{user_id}",
    messages=[profile_data]
)

# Historical record (append)
client.retain(
    bank_id="user-data",
    document_id=f"history:{user_id}:{timestamp}",
    messages=[profile_data]
)

The Implementation
Under the hood, document_id maps to a stable identifier in the database. When you call retain() with an existing document_id:
- All facts, entities, and graph edges from the previous document are marked for deletion
- New extraction runs on the provided messages
- New facts are inserted with the same document_id
- Old facts are removed in a single transaction
This is fast - the extraction pipeline is optimized for real-time use cases. For most conversations, retention completes in hundreds of milliseconds.
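To make the replace-then-insert flow concrete, here is a purely illustrative sketch of an upsert transaction over a single facts table - not Hindsight's actual schema or code, just the shape of the operation:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (document_id TEXT, fact TEXT)")

def upsert_document(document_id: str, new_facts: list[str]):
    # Delete and insert inside one transaction: readers see either
    # the old facts or the new facts, never a partial mix
    with conn:
        conn.execute("DELETE FROM facts WHERE document_id = ?", (document_id,))
        conn.executemany(
            "INSERT INTO facts (document_id, fact) VALUES (?, ?)",
            [(document_id, fact) for fact in new_facts],
        )

upsert_document("employment-info", ["works at Acme Corp"])
upsert_document("employment-info", ["works at NewStartup as CTO"])
print(conn.execute("SELECT fact FROM facts").fetchall())
# [('works at NewStartup as CTO',)]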
By default, retain() runs synchronously - you call it and wait for completion. This is what you want when memories must be available immediately for the next query. But when you’re batch-ingesting documents or don’t need instant availability, you can run it asynchronously:
# Sync (default) - blocks until complete
client.retain(bank_id="my-bank", document_id="doc-1", messages=[...])

# Async - returns immediately, processing happens in background
client.retain(bank_id="my-bank", document_id="doc-1", messages=[...], retain_async=True)

The upsert happens atomically either way. Queries during the update see either the old state or the new state, never a partial mix.
The graph connections get rebuilt from scratch. If the new content mentions the same entities, fresh relationship edges are created. If entities are no longer mentioned, their edges from this document disappear (though the entities themselves persist if referenced by other documents).
Practical Considerations
Choose document_ids carefully. They’re your update granularity. Too coarse (one ID for everything) and you can’t update selectively. Too fine (one ID per message) and you lose the upsert benefit.
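As a rough illustration of granularity choices (the naming scheme here is just an example, not a convention Hindsight requires):

# Too coarse: one ID for the whole user - every retention replaces everything stored for them
document_id = f"user:{user_id}"

# Too fine: one ID per message - nothing ever gets replaced
document_id = f"user:{user_id}:msg:{message_index}"

# A useful middle ground: one ID per unit that changes together
document_id = f"user:{user_id}:session:{session_id}"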
Full conversation retention works best. Rather than retaining individual messages, retain the full conversation with each update. This lets extraction see full context and produce more accurate facts.
Idempotency is built-in. Retaining the same content with the same document_id produces the same memories. Safe to retry on failure.
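Because retention is idempotent per document_id, a simple retry wrapper is safe. A minimal sketch, assuming the client from the earlier examples and an arbitrary retry count and backoff:

import time

def retain_with_retry(bank_id: str, document_id: str, messages: list, attempts: int = 3):
    for attempt in range(attempts):
        try:
            # Re-running with the same document_id and content produces
            # the same memories as a single successful call
            client.retain(bank_id=bank_id, document_id=document_id, messages=messages)
            return
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff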
Document upserting is a simple concept with significant practical impact. In my experience building agents with long-term memory, data staleness is a constant battle. The document_id pattern - explicit identifiers for upsert semantics - handles the common case cleanly without complex deduplication logic.