Observational & working memory

Background compression tiers, the remember tool, per-turn memory gates, and query-conditioned relevance filtering.

Beyond raw conversation history, Maniac supports two summarized memory tiers that splice into the chat prefix as system messages. Both run background LM jobs so foreground turns stay fast.

Tier | Header | Updated by
Observational | [Active observations], [Reflections] | Observer + Reflector (background)
Working | [Working memory] | Post-turn updater + remember tool

Conversation turns in ConversationStore are always preserved. Only memory-tier prefix blocks are candidates for the relevance filter.

Configure on Maniac

import { Maniac } from "@maniac-ai/agents";
import { SqliteMemory } from "@maniac-ai/agents/memory/sqlite";

const memory = new SqliteMemory({ filename: "./agent.db" });
await memory.setup();

const app = new Maniac({
  memory,
  agents: [myAgent],
  observationalMemory: {
    model: cheapModel,
    scope: "thread" // or "resource" for cross-thread pooling
  },
  workingMemory: {
    model: cheapModel,
    scope: "resource", // default — pool profile across threads for same user
    template: "# User profile\n\n## Goals\n\n## Notes\n",
    update_after_every_turn: true
  },
  memoryRelevanceFilter: {
    model: cheapModel,
    min_chars_to_filter: 2000
  }
});

Pass resourceId on chat / chatStream when using scope: "resource":

await app.chat("agent", "thread-1", "Hello", { resourceId: "user-42" });

Observational memory

Observational memory compresses unobserved message tails into dense observation chunks stored in ObservationStore.

Per-turn flow

  1. loadThread — builds the prefix: active observations, reflections, current task, then the recent raw-message tail
  2. Foreground run — the agent sees the compressed prefix instead of the full history
  3. afterTurn — schedules background Observer to compress new slices; Reflector folds observations into higher-level reflections
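The ordering in step 1 can be sketched as a small pure function. This is an illustration only, not the SDK's loadThread: the ThreadMemory and ChatMessage shapes, the buildPrefix name, and the [Current task] header are assumptions made for the example. Only the [Active observations] and [Reflections] headers come from the table above.

```typescript
// Hypothetical shapes for illustration; not the SDK's actual types.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ThreadMemory {
  activeObservations: string[]; // compressed observation chunks
  reflections: string[];        // higher-level summaries of observations
  currentTask?: string;
  recentTail: ChatMessage[];    // raw messages not yet observed
}

// Assembles messages in the documented order:
// active observations, reflections, current task, then the raw tail.
function buildPrefix(mem: ThreadMemory): ChatMessage[] {
  const prefix: ChatMessage[] = [];
  if (mem.activeObservations.length > 0) {
    prefix.push({
      role: "system",
      content: "[Active observations]\n" + mem.activeObservations.join("\n"),
    });
  }
  if (mem.reflections.length > 0) {
    prefix.push({
      role: "system",
      content: "[Reflections]\n" + mem.reflections.join("\n"),
    });
  }
  if (mem.currentTask) {
    // The "[Current task]" header is an assumption of this sketch.
    prefix.push({ role: "system", content: "[Current task]\n" + mem.currentTask });
  }
  return [...prefix, ...mem.recentTail];
}
```

Empty tiers are skipped here so the foreground run never sees a header with no body; whether the SDK does the same is not stated in this page.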

Activation triggers

Buffered observations become active (included in the prefix) when:

  • Unobserved tail exceeds message_chars
  • Idle timeout elapses
  • Provider changes mid-thread
  • Sync fallback (block_after) fires
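As a rough mental model, the four triggers can be read as a first-match check over the buffer state. The sketch below is hypothetical: BufferState, TriggerConfig, idle_timeout_ms, and activationTrigger are names invented for the example, and the block_after fallback is reduced to a boolean flag because its exact semantics are not described here.

```typescript
// Hypothetical state and config; only message_chars and block_after
// are names that appear in the docs.
interface BufferState {
  unobservedChars: number;    // size of the unobserved message tail
  idleMs: number;             // time since the last message
  providerChanged: boolean;   // provider switched mid-thread
  syncFallbackFired: boolean; // stand-in for the block_after fallback
}

interface TriggerConfig {
  message_chars: number;
  idle_timeout_ms: number; // assumed option name
}

type Trigger = "message_chars" | "idle_timeout" | "provider_change" | "block_after";

// Returns the first trigger that applies, or null if observations stay buffered.
function activationTrigger(state: BufferState, config: TriggerConfig): Trigger | null {
  if (state.unobservedChars > config.message_chars) return "message_chars";
  if (state.idleMs >= config.idle_timeout_ms) return "idle_timeout";
  if (state.providerChanged) return "provider_change";
  if (state.syncFallbackFired) return "block_after";
  return null;
}
```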

Statuses: buffered → active.

Per-turn opt-out

Gate memory reads and writes per turn, for either tier, via the memory option on chat:

await app.chat("agent", "thread", message, {
  memory: {
    observational: "read", // "off" | "read" | "write"
    working: "off"
  }
});

REPL introspection

The runtime exposes an ObservationsProxy for debugging: current(), buffered(), reflections(), state(), recent().

Working memory

Working memory is a single markdown document the agent maintains across turns — preferences, goals, standing instructions.

Each turn:

  1. WorkingMemoryStore.load() → splice as [Working memory]\n<doc>
  2. After the turn, a background LM updater rewrites the doc from prior content + new messages + optional template

The built-in remember(note) tool appends bullets under ## Notes (configurable via note_section).
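To make the append behavior concrete, here is a minimal sketch of adding a bullet under a markdown section. It is not the SDK's implementation: appendNote is a hypothetical helper, and creating the section when it is missing is an assumption of the sketch.

```typescript
// Hypothetical helper; illustrates appending a bullet under a section
// such as "## Notes" in a markdown working-memory document.
function appendNote(doc: string, note: string, noteSection = "## Notes"): string {
  const bullet = `- ${note.trim()}`;
  const lines = doc.split("\n");
  const start = lines.findIndex((l) => l.trim() === noteSection);

  if (start === -1) {
    // Assumption: when the section is missing, create it at the end.
    const base = doc.trimEnd();
    return (base ? base + "\n\n" : "") + noteSection + "\n" + bullet + "\n";
  }

  // The section runs until the next heading of the same or higher level.
  const level = noteSection.match(/^#+/)?.[0].length ?? 2;
  const nextHeading = new RegExp("^#{1," + level + "}\\s");
  const offset = lines.slice(start + 1).findIndex((l) => nextHeading.test(l));
  const end = offset === -1 ? lines.length : start + 1 + offset;

  // Back up over trailing blank lines so bullets stay contiguous.
  let insertAt = end;
  while (insertAt > start + 1 && (lines[insertAt - 1] ?? "").trim() === "") {
    insertAt--;
  }
  lines.splice(insertAt, 0, bullet);
  return lines.join("\n");
}
```

Applied to the template from the configuration example, the first call places the bullet directly under ## Notes, and later calls add bullets after the existing ones.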

Default scope is "resource" so a user profile pools across threads when you pass the same resourceId.

Relevance filter

MemoryRelevanceFilter runs a cheap background LM before each foreground turn to score which observation/reflection blocks and working-memory ## sections are relevant to the new user query. It reduces noise for multi-domain agents with large memory prefixes.

Configure on Maniac:

const app = new Maniac({
  memory,
  memoryRelevanceFilter: {
    model: cheapModel,
    min_chars_to_filter: 2000,   // skip LM when candidate body is smaller
    filter_observations: true,    // obs + reflection blocks
    filter_working_memory: true,  // WM sections by ## headers
    keep_when_empty: true,        // fail-open: keep unfiltered if LM drops all
    instructions: "Optional override for the scorer system prompt"
  }
});

Behavior

  • parseMemoryPrefix() decomposes the prefix into block ids (wm.0, obs.2, ref.1, …)
  • The scorer LM returns {"keep": ["wm.0", "obs.2", ...]}
  • Fail-open: LM or parse errors return the unfiltered prefix and emit a memory_filter_error trace event
  • If the LM drops every block and keep_when_empty is true, the unfiltered prefix is kept

Conversation history is never filtered — only [Active observations], [Reflections], and [Working memory] blocks.
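The fail-open rules above can be sketched as two small functions: one decides whether the scorer should run at all, the other applies its JSON reply. This is an assumption-laden illustration, not the SDK's MemoryRelevanceFilter: MemoryBlock, shouldFilter, parseKeepIds, and applyKeepList are hypothetical names, and the real filter also emits trace events that are omitted here.

```typescript
// Hypothetical block shape; ids follow the documented pattern (wm.0, obs.2, ref.1).
interface MemoryBlock {
  id: string;
  text: string;
}

// Mirrors min_chars_to_filter: skip the scorer when the candidate body is small.
function shouldFilter(blocks: MemoryBlock[], minCharsToFilter: number): boolean {
  const total = blocks.reduce((n, b) => n + b.text.length, 0);
  return total >= minCharsToFilter;
}

// Parses {"keep": [...]}; returns null on any malformed reply.
function parseKeepIds(scorerOutput: string): string[] | null {
  try {
    const parsed: unknown = JSON.parse(scorerOutput);
    if (typeof parsed !== "object" || parsed === null) return null;
    const ids = (parsed as { keep?: unknown }).keep;
    if (!Array.isArray(ids)) return null;
    return ids.filter((x): x is string => typeof x === "string");
  } catch {
    return null;
  }
}

// Applies the keep list with the two fail-open paths described above.
function applyKeepList(
  blocks: MemoryBlock[],
  scorerOutput: string,
  keepWhenEmpty: boolean,
): MemoryBlock[] {
  const ids = parseKeepIds(scorerOutput);
  if (ids === null) return blocks; // parse error: keep the unfiltered prefix
  const keep = new Set(ids);
  const kept = blocks.filter((b) => keep.has(b.id));
  if (kept.length === 0 && keepWhenEmpty) return blocks; // scorer dropped everything
  return kept;
}
```

Ids in the keep list that match no block are ignored in this sketch, so a hallucinated id cannot add content to the prefix.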

Trace events

Filter runs emit tracer events (kind memory) you can surface in UIs or export via OTelTracer. An optional onComplete callback receives a MemoryFilterSummary.

Layering diagram

flowchart TB
  CS[ConversationStore] --> OB[ObservationBuffer]
  OS[ObservationStore] --> OB
  WM[WorkingMemoryStore] --> WMR[WorkingMemoryRunner]
  OB --> prefix[prefixMessages]
  WMR --> prefix
  prefix --> MRF[MemoryRelevanceFilter]
  MRF --> run[runAgent foreground]
  CS --> run

Honcho interaction

Do not enable both HonchoMemory and observationalMemory. Honcho already compacts session context, and the SDK logs a warning when both are configured.
