Lessons graph reference

.agentsmesh/lessons/lessons.json is the single source of truth for the recall/capture subsystem. This page documents its schema, the validation codes returned by agentsmesh lessons validate, and the programmatic API for consumers building on top.

For the CLI surface, see agentsmesh lessons.

Graph shape

The graph has four top-level fields:

{
  "version": 1,
  "lessons":  { "<lesson-id>":  { rule, topics, triggers, evidence, status, createdAt, … } },
  "topics":   { "<topic-id>":   { summary } },
  "triggers": { "<trigger-id>": { kind, pattern } }
}

All ids are kebab-case (^[a-z0-9-]+$). The file is deterministically sorted (alphabetical keys at every depth) and always ends with a single trailing newline, so diffs reflect content changes rather than insertion order.

Lesson

Field	Type	Required	Description
`rule`	string	yes	Imperative rule text. Non-empty.
`rationale`	string	no	One-line “why” behind the rule.
`topics`	string[]	yes	At least one topic id. Every entry must resolve in `topics`.
`triggers`	string[]	yes	Trigger ids. Each must resolve in `triggers`. May be empty (the lesson is then never recalled).
`evidence`	string[]	yes	Free-form references (`commit:SHA`, `legacy:.agentsmesh/lessons/topics/foo.md#rule-3`, `lesson:other-id`).
`status`	`"active" \| "deprecated" \| "superseded"`	yes	Recall only returns `active` lessons.
`supersededBy`	string	conditional	Required when `status` is `superseded`; forbidden when `status` is `active`.
`createdAt`	ISO date or datetime	yes	When the lesson was first added. Determines journal ordering.

Topic

Field	Type	Required	Description
`summary`	string	yes	One-line description. Non-empty.

Trigger

Field	Type	Required	Description
`kind`	`"file_glob" \| "command_pattern" \| "keyword"`	yes	Trigger kind.
`pattern`	string	yes	Pattern body. Non-empty.

file_glob patterns use the picomatch syntax. command_pattern patterns are a linear-engine subset of JavaScript regex (no flags) — see Recall semantics for the exact supported/unsupported constructs. keyword patterns are case-insensitive substrings of --keyword, and also match against --file/--cmd on token boundaries (see Recall semantics).

Triggers are content-addressed: the migrator and agentsmesh lessons add both dedupe by (kind, pattern), so two lessons targeting the same glob share one trigger node.

Recall semantics

In plain terms: a lesson comes back from recall when any of its triggers matches the file, command, or keyword you passed. The rest of this section specifies exactly how each trigger kind matches — read it when you need the precise contract; to get started, see the lessons guide instead.

agentsmesh lessons query returns every active lesson whose triggers match any supplied field:

--file <p> matches file_glob triggers via picomatch with dot: true.
--cmd <c> matches command_pattern triggers. Mandatory recall executes these on every command, and a backtracking RegExp cannot be proven linear by inspection (even a+b is quadratic on a long non-matching input). So command patterns are matched by an in-repo non-backtracking engine (a Thompson NFA): matching is linear in the input length for any pattern it can compile, so no pattern — (a+)+, a+a+, or a 30k-char adversarial command — can hang recall.

Supported subset (semantics match new RegExp(pattern) without any flags): literals; escapes \d \D \w \W \s \S, \t \n \r \f \v \0, \xHH, \uHHHH, \cX; . (matches any char except the line terminators \n, \r, U+2028, U+2029 — no dotAll); character classes (ranges, negation, \b = backspace inside a class); anchors ^ $ (end/start of input — no m/multiline, so $ does not match before a final newline) and \b \B; groups ( ) (?: ) (?<name> ) (no capture is needed for matching); alternation |; and quantifiers * + ? {n} {n,} {n,m} (lazy variants accepted, same match result). Not supported (and rejected at capture as UNSAFE_TRIGGER_PATTERN, skipped at read time): backreferences, lookaround, \u{…} (a u-flag-only form), and patterns that expand to too large an NFA (e.g. a{1000} ×10 — match cost is O(states × input), not O(pattern length)). Invalid syntax is INVALID_TRIGGER_PATTERN.

Matching is iterative (no recursion → no stack overflow on long ε-chains), and a query-wide work budget bounds the total matching work across all triggers in a recall (not just per pattern), with no input truncation — so neither a deep pattern, many near-cap patterns, nor a huge command can exhaust the stack or blow the time budget, and a budget-exhausted match degrades to a safe non-match (never a false positive). See isSafeRegexPattern in the programmatic API.
keyword triggers match in two ways: (1) the explicit --keyword <t> via keyword.toLowerCase().includes(pattern.toLowerCase()) (substring), and (2) the --file path and --cmd command — the pattern’s tokens must appear as a contiguous run in the path/command tokens (lowercase, split on non-alphanumerics; the pattern drops single-character tokens and stopwords). This lets a keyword-only (conceptual) lesson surface on mandatory --file/--cmd recall without a hand-crafted --keyword, while token-boundary matching keeps cat from firing on category.ts. Matching is on whole tokens, so joined identifiers (readOnly, readonly) are not split.

Combination is OR across triggers per lesson: a lesson is a candidate if any of its triggers fire from any supplied field. Candidates are then relevance-ranked by a weighted reciprocal-rank fusion (RRF) of three signals — trigger specificity (inverse fanout; highest weight, because a discriminating trigger beats a topic-wide one), per-query topic coherence (a lesson in the topic that dominates this query’s matched set is boosted; middle weight), and BM25 over the rule text (lowest weight; on --file/--cmd recall the rule prose rarely contains a path, so it only breaks ties the structural signals cannot) — with ties broken by recency then id. The CLI/MCP return the top 10 by default, bounded by a default token budget (DEFAULT_RECALL_MAX_TOKENS, 400) so mandatory recall stays lean. Override with --top <n> / --max-tokens <n> (MCP limit/max_tokens); --all bypasses both caps. The single most-relevant result is always returned even if it alone exceeds the budget. The low-level queryLessons export returns the raw unranked candidate set; rankLessons applies the ranking + caps; the migration-aware recallLessons application API wraps load → query → rank (and migrates a legacy store first).

Recall tuning

Per-project tuning lives in .agentsmesh/lessons/config.json, which agentsmesh init --lessons writes with every field at its default so the tunables are discoverable in one place (writing the defaults out is behaviour-neutral — a missing field already falls back to the same value):

{ "recallLimit": 10, "recallMaxTokens": 400, "autoPrune": false, "repairTriggers": false }

recallLimit / recallMaxTokens are the canonical names for the two recall caps; the per-call --top / --max-tokens flags are their invocation-time overrides for the same limits (and --all disables both). Both fields are optional and independently fall back to the built-ins (DEFAULT_RECALL_LIMIT = 10, DEFAULT_RECALL_MAX_TOKENS = 400). recallMaxTokens is approximate — per-rule cost is estimated as rule.length / 4, not a real tokenizer, so treat the budget as a soft bound. Lowering them keeps mandatory --file/--cmd recall lean on a large, high-fanout graph where recall otherwise returns many lessons per call (see the stats break-even / match-count histogram). Reading is fail-safe: a missing or malformed file, or an invalid field (non-positive, non-integer), silently uses the default for that field — recall, a blocking hot path, never throws on config.

autoPrune (boolean, default false) opts into automatic graph hygiene: after every successful capture, the GC-only half of prune runs — orphan triggers/topics are removed and non-stranding dead file_glob triggers are detached, reusing the working-tree walk the capture already did. It never trims a within-cap lesson, drops an active lesson, or strands one (a lesson keeps ≥ 1 trigger), and every change is git-reversible. The capture reports what it cleaned (auto-pruned: N orphan triggers, M orphan topics, K dead globs detached). Over-cap trigger trimming stays exclusive to the manual lessons prune --apply, which remains the deliberate, reviewed curation path.

repairTriggers (boolean, default false) opts into capture-time trigger repair — the enforcement half of the warn-only capture guardrails, applied to the capture INPUT before the write. A broad or wide file_glob is narrowed toward the evidence file’s directory class (src/lessons/recall.ts → src/lessons/*.ts) — but only when a concrete evidence path exists, the author’s glob covers it, and the narrowed glob matches no more files than the original; with no usable evidence the glob is kept and only warned about, never rewritten blind. A stopworded or over-long keyword gets a matchable variant added beside it (the original is kept — it still byte-matches prompt text; the variant makes the mandatory --file/--cmd token path reachable), and a keyword that tokenizes to nothing is dropped. Repair never leaves a lesson with zero triggers. Every rewrite is reported as a warning on the capture result (NARROWED_GLOB, KEYWORD_VARIANT_ADDED, DROPPED_KEYWORD), so the author sees exactly what changed — the narrowed directory class is an automatic compromise, and general/library-behavior rules should still be re-pointed at their true file-CLASS recurrence surface by hand.

Validation codes

agentsmesh lessons validate returns a list of findings with these codes:

Code	Level	Meaning
`SCHEMA_INVALID`	error	The graph fails the Zod schema. Other checks are skipped.
`DANGLING_TOPIC`	error	A lesson references an unknown topic id.
`DANGLING_TRIGGER`	error	A lesson references an unknown trigger id.
`DANGLING_SUPERSEDER`	error	`supersededBy` points to an unknown lesson id.
`DUPLICATE_RULE`	error	Two or more lessons share the same normalized rule text (whitespace-collapsed, lowercased).
`SUPERSEDED_WITHOUT_TARGET`	error	`status: superseded` without a `supersededBy` field.
`ACTIVE_WITH_SUPERSEDER`	error	`status: active` with a `supersededBy` field.
`SELF_SUPERSEDED`	error	A lesson’s `supersededBy` points at itself.
`SUPERSEDE_CYCLE`	error	`supersededBy` links form a cycle.
`INACTIVE_SUPERSEDER`	error	`supersededBy` points at a non-active lesson — the chain dead-ends with no live replacement.
`INVALID_TRIGGER_PATTERN`	error	A `command_pattern` trigger is not a valid regex (would be silently unreachable).
`UNSAFE_TRIGGER_PATTERN`	error	A `command_pattern` the linear matcher cannot evaluate — a backreference, lookaround, `\u{…}`, or a pattern that expands to too large an NFA (e.g. `a{1000}` ×10). Backtracking-prone shapes like `(a+)+` are NOT flagged: the engine runs them in linear time. Rejected at capture, skipped at read time.
`DUPLICATE_TRIGGER`	error	Two or more trigger ids share the same `(kind, pattern)`. Trigger ids are content-addressed, so `add` cannot create this — it only arises from a low-level mutation or hand-edit, and it would distort dedup, fanout, and ranking specificity.
`DUPLICATE_TOPIC_REF`	error	A lesson references the same topic id more than once. `add` unions (dedups); a repeat only arises from a raw mutation and skews accounting.
`DUPLICATE_TRIGGER_REF`	error	A lesson references the same trigger id more than once — double-counts in fanout and skews ranking specificity.
`UNREACHABLE_LESSON`	warning	An active lesson has zero triggers and can never be recalled.
`HIGH_FANOUT_TRIGGERS`	warning	One summary finding: N triggers each match more than 10 active lessons (recall leans on ranking).
`ORPHAN_TOPIC`	warning	A topic node is not referenced by any lesson.
`ORPHAN_TRIGGER`	warning	A trigger node is not referenced by any lesson.
`LOW_SIGNAL_KEYWORD`	warning	A `keyword` trigger on an active lesson carries more than 5 tokens. Recall matches a keyword only as a substring of `--keyword` or a contiguous token-run in the file/command, so a long descriptive pattern almost never fires. Use a short distinctive phrase (fix in place with `untrigger` + re-`add`).
`DEAD_FILE_GLOB`	warning	A `file_glob` trigger on an active lesson matches no file in the working tree — almost always because a refactor renamed the path it pointed at, so the lesson is silently unreachable via that trigger. This is a liveness check, not a breadth one: a narrow glob matching even one file is fine; only a glob matching zero is flagged. Re-point it at the current path, or detach it with `untrigger`. Requires a working tree to check, so it runs in `validate`/`lint` (which know the project root) but never in the `add` write barrier.
`RUNNER_ANCHORED_PATTERN`	warning	A `command_pattern` on an active lesson is anchored to a single package runner (e.g. `^pnpm test`, `^npx vitest`). It won’t fire for the same task run another way — an agent that types `npx vitest` gets nothing from a `^pnpm` lesson. A scope-match gap, not a breadth one: drop the `^<runner>` anchor and key on the task verb (e.g. `\bvitest\b`).
`STOPWORD_KEYWORD`	warning	A `keyword` trigger on an active lesson has a needle that loses all tokens to stopword filtering (e.g. “state of the art” collapses to tokens [state, art] which cannot match the contiguous run “state of the art”). The trigger cannot fire on the mandatory `--file`/`--cmd` recall path. Drop the stopwords or replace the trigger. Curation counterpart to the capture-time `STOPWORD_KEYWORD` guardrail warning; does not fail validation.
`CORRUPT_GRAPH`	error	`lessons.json` could not be parsed at all (bad JSON — e.g. a botched merge resolution). Recall degrades to empty while this stands; the graph is git-tracked, so restore it from git or repair the JSON. Emitted by `validate` itself, since recall’s corrupt-graph warning routes you here.

Exit code is non-zero when any error-level finding exists. Warnings do not affect the exit code. DUPLICATE_RULE considers active lessons only, so superseding or deprecating one copy clears it — that is how merge repairs a duplicate.

Maintenance cadence

Triggers rot as the codebase moves under them — a rename silently turns a good lesson into a DEAD_FILE_GLOB. A light cadence keeps the graph reachable:

Every PR (CI): agentsmesh lint already runs the graph through validate, so dead globs, runner-anchored patterns, and integrity errors surface in the same gate that checks generated drift. Errors fail the build; the liveness/precision findings are warnings you triage.
Occasionally (e.g. monthly, or after a big refactor): run agentsmesh lessons validate and act on the warnings — re-point or untrigger a DEAD_FILE_GLOB, de-anchor a RUNNER_ANCHORED_PATTERN, shorten a LOW_SIGNAL_KEYWORD, then agentsmesh lessons prune --apply to GC orphan triggers/topics and trim over-capped lessons.
On a wrong/contradicted lesson: deprecate it (the recall that surfaces it is the signal). A deprecated lesson stops being recalled immediately.

Capture guardrails

agentsmesh lessons add (and the lessons_add MCP tool) returns non-blocking warnings on the resulting lesson — these steer authors toward a few specific triggers, because over-triggering is the main thing that erodes recall precision. Capture is not rejected for any of these warnings, since losing a lesson is worse than an over-broad one.

The one exception — UNRECALLABLE_LESSON (blocking, exit 2). Capture IS rejected when every trigger on the resulting lesson is dead on the mandatory --file/--cmd recall path. A trigger is dead when: (a) it is a keyword whose needle loses all tokens to stopword filtering (e.g. “state of the art” → tokens [state, art] cannot fire as a contiguous run — the phrase “cannot fire on the mandatory —file/—cmd recall path”); or (b) it is a command_pattern rejected by the linear write barrier as INVALID_TRIGGER_PATTERN or UNSAFE_TRIGGER_PATTERN. A lesson with a mix — one live file_glob and one dead keyword — is NOT rejected; the dead keyword surfaces as a non-blocking STOPWORD_KEYWORD warning. Note that a dead keyword can still match via an explicit --keyword substring call; the rejection applies only when the mandatory file/command path is the sole remaining route.

A second blocking case — OVERSIZED_RULE (exit 2). Capture is rejected when the rule text exceeds 2000 characters. A rule is one imperative sentence; a far longer one is a malformed capture (a pasted log or diff) that would bloat every recall surfacing it. Trim it, or split it into separate lessons. The recall hook also truncates any rule over that bound before injecting it into agent context, so a graph from an untrusted cloned repo cannot flood the context with one giant rule (see Trust model).

The following are non-blocking warnings. The CLI prints them to stderr (stdout stays paste-clean); MCP returns them in the warnings array.

Code	Meaning
`OVERSIZED_LESSON_TRIGGERS`	The lesson carries more than the recommended cap (8) of triggers — it fires on too many edits and dilutes recall.
`BROAD_GLOB_TRIGGER`	A `file_glob` matches large swaths of the tree (a bare star/globstar, or a globstar with a wildcard basename). Prefer a path specific to the lesson.
`KEYWORD_ONLY_LESSON`	Every trigger is a `keyword`. Mandatory `--file`/`--cmd` recall surfaces these only when the keyword appears as a path/command token, so it fires less reliably — add a `file_glob` or `command_pattern` trigger for precise recall.
`LOW_SIGNAL_KEYWORD`	A `keyword` trigger carries more than 5 tokens. Recall matches a keyword only as a substring of `--keyword` or a contiguous token-run in the file/command, so a long descriptive pattern almost never fires and the lesson silently falls back to its `file_glob`. Use a short distinctive phrase.
`STOPWORD_KEYWORD`	A multi-word `keyword` pattern contains stopwords/short words (e.g. “state of the art”). Recall filters them from the pattern but not from the file/command text, so the phrase cannot fire as a contiguous run on the `--file`/`--cmd` path — the trigger cannot fire on the mandatory —file/—cmd recall path. Drop the stopwords (“state art”). When this is the ONLY trigger kind, the capture is rejected as `UNRECALLABLE_LESSON` (see above).
`DEAD_GLOB`	A `file_glob` trigger on the captured lesson matches no file in the working tree — likely a rename or typo. Re-point it or the lesson will be unreachable via that glob. (Distinct from the `DEAD_FILE_GLOB` validation code, which is the curation counterpart emitted by `validate`.)
`NEAR_DUPLICATE_LESSON`	The new lesson closely paraphrases an existing active lesson (token-Jaccard ≥ 0.6 over rule text). Suggests updating lesson X instead of adding a paraphrase. Not emitted on an exact re-capture (which upserts) or on any upsert path.

Curation (`prune`)

agentsmesh lessons prune curates the graph in two safe, deterministic, git-reversible steps. It is dry-run by default (prints the plan, writes nothing); pass --apply to write through the transactional path, and --cap <n> to override the per-lesson trigger cap (default 8):

Trim over-cap active lessons down to cap triggers, dropping the highest-fanout (least specific) triggers first — those are the topic-wide broadcasters that dilute recall — never below one trigger.
Remove dead triggers — those no active lesson references (orphans, or referenced only by superseded/deprecated lessons) — from the table, stripping their now non-recall-bearing references everywhere so nothing dangles.

HIGH_FANOUT_TRIGGERS from validate tells you when a prune would help.

Recall telemetry (`stats`)

Recall runs before each edit and each state-changing command (pure-read commands and the recall query itself are exempt), so its frequency — not its per-call payload — is the dominant token cost. Telemetry is opt-in and off by default: set AGENTSMESH_LESSONS_TELEMETRY=1 and each recallLessons call appends one JSON row to .agentsmesh/lessons/recall-log.jsonl. Rows carry field-presence booleans (hasFile/hasCommand/hasKeyword) plus match counts, returned-token cost, per-trigger-kind provenance, the ids of the returned lessons, a bypassed flag for --all diagnostic dumps, and an optional AGENTSMESH_SESSION_ID correlator — never the raw file / command / keyword text. stats groups rows into sessions (by the session id, or a >30-minute idle gap when none is set) and reports the honest per-session break-even: preloading the active set costs wholeActiveSetTokens once per session, so the comparison multiplies preload by the session count and excludes --all dumps from the mandatory-recall side, rather than pitting a whole multi-session window of recalls against a single preload. It also reports the intra-session redundancy rate — the share of delivered rule-tokens that re-deliver a lesson already shown earlier in the same session (the dedup opportunity), with a coverage figure since rows predating the lesson-id field can’t be measured. The log is telemetry, not the canonical graph; the recall hot path computes and writes nothing when telemetry is disabled. It is also size-capped — once it grows past the cap it self-truncates to the most recent records on the next write, so it cannot accumulate unbounded even if tracked. agentsmesh init --lessons additionally adds the recall-log.jsonl, capture-log.jsonl, and outcome-log.jsonl telemetry logs (all under .agentsmesh/lessons/) to your project’s .gitignore, keeping runtime logs out of git entirely (git’s .gitignore, not the generation-level .agentsmesh/ignore).

agentsmesh lessons stats (programmatic: summarizeRecall(records, graph)) aggregates the log into: no-match rate, returned-token p50/p90/max, cumulative recall cost vs. the whole-active-set preload baseline (the break-even that answers “is per-action recall cheaper than loading every active rule once per session?”), and reachability — the share of recalls that fired only via keyword, and the count of active lessons whose every trigger is a keyword (invisible to mandatory --file/--cmd recall).

Capture telemetry

AGENTSMESH_LESSONS_TELEMETRY=1 also enables a symmetric capture log at .agentsmesh/lessons/capture-log.jsonl — the counterpart to recall-log.jsonl. Each captureLesson / agentsmesh lessons add call appends one row recording presence/count fields only (never rule text): isNewLesson, isNewTopic, newTriggerCount, triggerKinds ({file, command, keyword} counts), blocked (boolean — true when the capture was rejected as UNRECALLABLE_LESSON), warningCodes (array of non-blocking warning codes), and optional session and lessonId.

agentsmesh lessons stats prints a Capture block (included under a capture key in --json output) even when only a capture log exists and no recall log is present. The capture block reports: total captures, blocked count, new-lesson vs. upsert split, new-topics count, warned count, trigger-kind breakdown (file / command / keyword shares), and a recall:capture ratio (recalls per capture — a health signal for the capture discipline).

Effectiveness telemetry (opt-in)

The same AGENTSMESH_LESSONS_TELEMETRY=1 gate enables an outcome log at .agentsmesh/lessons/outcome-log.jsonl — the substrate for lesson effectiveness: did a delivered lesson actually prevent the repeat? It appends two event kinds — delivered (a lesson injected for an action, stamped with its per-batch relevance rank) and failure (a failure observed for an action) — both keyed by the normalized action (file:<path> or cmd:<class>), never raw error text, plus the session correlator (the harness hook session_id, or AGENTSMESH_SESSION_ID for shell-driven recalls) so a failure only impeaches deliveries within its own session. Effectiveness is derived at read time: a lesson delivered for an action that then fails again is a miss. Four existing outputs consume it — no new command or flag:

Recall down-ranking. A lesson that fired but never helped sinks in the ranking (a low, tie-breaking weight; neutral when there is no data, so ranking is unchanged by default).
validate health view. validate gains two warning-level findings — INEFFECTIVE_LESSON (delivered enough times, never helped — consider deprecate or a sharper rule) and UNCOVERED_FAILURE (a repeat file failure with no lesson to prevent it — consider add). They are advisory and never change the exit code.
stats effectiveness block. lessons stats adds a benefit summary alongside its cost (break-even) and activity (capture) blocks: total deliveries, the coarse held rate (share of deliveries with no recorded repeat on the same action — a weak upper bound, never presented as proof of prevention), the ineffective-lesson count, and failures observed — with a pointer to validate for the actionable list.
The recall hook’s recurrence gate. On a PreToolUse first touch of an action whose failure history has reached the recurrence threshold and that a captured lesson covers, the hook escalates: the covering rule is re-injected above the regular recall bullets with the failure count, cutting through session dedup — advisory context, once per action per session. See hook mode.

The signal is deliberately coarse and labeled as such — an honest weak signal beats a precise-looking fake one. Like the other logs it is opt-in, size-capped, and gitignored by init --lessons. It is also per-machine: the graph (your rules) is shared, but effectiveness reflects your own recall/failure history, so the health view and ranking can differ between developers — see Sharing and cross-agent use.

The graph .agentsmesh/lessons/lessons.json is one committed file, shared by every agent and every teammate. A lesson captured by one agent (Cursor, Aider, Claude Code, …) is immediately recalled by any other — the graph is target-neutral, and recall/capture are reachable two ways from any agent: the CLI (agentsmesh lessons query / add) for shell-capable agents, or the MCP tools (lessons_query / lessons_add) for agents without a shell. Recall itself reaches every target through one of two channels: a hook on hook-capable targets (deterministic, no extra model turn) and the always-on lessons paragraph in the root instructions everywhere else — so no agent is left without recall.

Effectiveness (above) is the one part that stays per-machine by design: sharing a hot append-only log would leak timing/behavior and create merge noise. Teams share what matters — the rules — and each machine measures effectiveness against its own history. Pooling effectiveness across a team is a possible future opt-in; the event keys are already machine-independent, so the seam exists.

Trust model

lessons.json is committed to the repository, so its rules are project content — trusted at the same level as the source code, CLAUDE.md, and the rest of .agentsmesh/. The recall hook injects matching rule text into the agent’s context, which makes the graph an input that influences agent behavior.

The practical consequence is for cloned third-party repositories: a lesson graph you did not author is untrusted input, exactly like the code in that repo. Review it before relying on its rules, the same way you would read code before running it. agentsmesh does not sanitize rule text — a rule is free prose by design — but it bounds the blast radius:

Length cap. Capture rejects a rule over 2000 characters (OVERSIZED_RULE), and the recall hook truncates any rule over that bound before injecting it, so a malformed or hostile graph cannot flood the agent’s context with one giant rule.
Hook input bound. agentsmesh lessons hook caps the stdin payload it reads, so a runaway producer cannot exhaust memory.
No code execution. Recall matches command_pattern triggers with a non-backtracking linear engine (no native RegExp, so no ReDoS), and never executes a trigger or rule — it only matches and prints.
Telemetry stays local. The opt-in logs record presence/count fields only (never the file/command/rule text) and never leave the machine.

Concurrency

Every mutation routes through one transactional write path (mutateLessonsGraph): acquire the lock → load → mutate → validate → atomic temp-write + rename. This includes add, merge, deprecate, untrigger, strip-markers, prune, migration, and scaffolding — none write the graph directly. The lock at .agentsmesh/lessons/.lessons.lock (mkdir-based, cross-platform) reuses the same primitive as .generate.lock/.install.lock (stale-eviction by PID, signal cleanup, retry budget). Concurrent writers serialize cleanly, an invalid mutation is never persisted, and a crash mid-write cannot truncate the graph.

One-shot upgrade migration

Earlier releases stored lessons as .agentsmesh/lessons/index.yaml plus per-topic Markdown files plus an append-only journal.md. On upgrade, the migrator collapses all three into lessons.json and deletes the legacy files:

agentsmesh lessons import-md

It also runs lazily on the first lessons subcommand when lessons.json is absent and index.yaml exists. New projects skip this entirely.

Migrated lessons carry their legacy provenance in evidence[0] as legacy:.agentsmesh/lessons/topics/<topic>.md#rule-N, plus any (Evidence L<n>) references parsed verbatim from the original bullets (preserved as legacy:L<n>).

Programmatic API

The new graph types and helpers are exported from agentsmesh/lessons:

import {
  // Blessed, migration-aware application APIs — prefer these.
  recallLessons, // migrate → load → query → rank (+ default token budget)
  captureLesson, // migrate → add (transactional)
  // Write path — migrates a legacy store first (safe to use directly).
  mutateLessonsGraph, // migrate → lock → load → mutate → validate → atomic save
  addLesson,
  mergeLessons,
  // Low-level READ primitives — do NOT migrate; see note below.
  loadLessonsGraph,
  tryLoadLessonsGraph,
  maybeAutoMigrateLessons, // run before a read primitive on a possibly-legacy project
  queryLessons, // raw candidate matcher (no migration, no ranking)
  rankLessons, // relevance ranking + caps over candidates
  isSafeRegexPattern, // true if the linear engine can match the command_pattern
  validateLessonsGraph,
  importLegacyLessons,
  DEFAULT_RECALL_MAX_TOKENS,
  type LessonsGraph,
  type Lesson,
  type Topic,
  type Trigger,
  type LessonsQuery,
  type ValidationReport,
} from 'agentsmesh/lessons';
// NOTE: raw saveLessonsGraph is intentionally not exported — it bypasses
// locking + validation. `mutateLessonsGraph` (and add/merge/deprecate built on
// it) MIGRATE a legacy index.yaml before writing, so a first write never strands
// it. Only the low-level READ primitives (tryLoadLessonsGraph/loadLessonsGraph/
// queryLessons) skip migration — use recallLessons, or maybeAutoMigrateLessons.

// Migration-aware recall (default token budget applied unless overridden):
const { lessons, totalMatches } = await recallLessons(process.cwd(), {
  file: 'src/cli/x.ts',
});
console.log(`${lessons.length} of ${totalMatches} shown`);

await captureLesson(process.cwd(), {
  rule: 'Normalize CLI display paths.',
  topic: 'windows-paths',
  triggers: { files: ['src/cli/**/*.ts'] },
  evidence: ['commit:abc1234'],
});

Token-cost comparison

The recall flow saves tokens by returning only the lessons that match — agents never read the whole graph.

Step	Legacy YAML + MD	JSON graph
Recall	Read 400+ line `index.yaml` + matching topic Markdown files. ~2–5k tokens per session.	One `query` call returning only the relevance-ranked top matches (default top 10 and a ~400-token budget; `--top`/`--max-tokens` to adjust, `--all` to bypass). Typically a few hundred tokens; the caps keep a broad trigger from returning a whole topic.
Capture	Edit `journal.md` + matching `topics/<topic>.md` + reconcile `index.yaml` triggers. Three file ops, often skipped.	One `add` call. Atomic.