Hermes + Obsidian Personal Knowledge Base Plan

For Hermes: Use subagent-driven-development skill to implement this plan task-by-task after the design is accepted. Use Obsidian skill for vault file operations. Use hermes-agent skill before changing Hermes configuration, MCP, memory providers, cron, gateway, or toolsets.

Goal: Build a local-first personal knowledge base where Obsidian is the human-facing workspace and Hermes is the agentic research, ingestion, retrieval, synthesis, lint, and automation layer.

Architecture: Keep markdown + git as the canonical store. Use Obsidian for editing, review, backlinks, properties, Bases/Dataview views, and manual clipping. Use Hermes to ingest sources, maintain schema/index/log, retrieve context, produce cited syntheses, run quality gates, and optionally automate background review jobs. Derived indexes such as SQLite FTS/BM25, embeddings, graph views, and MCP/REST integrations are optional and must be rebuildable from markdown.

Tech Stack: Obsidian vault, markdown/YAML properties, git, Hermes file/search/session/memory/cron/MCP tools, optional Obsidian Local REST API or Obsidian MCP server, optional Dataview/Bases, optional Docsify/GitHub Pages publication, optional SQLite FTS5/BM25 and sqlite-vec later.


Executive Summary

The best near-term implementation is not to replace Obsidian with Hermes memory, nor to dump every conversation into Obsidian. Treat Hermes built-in memory as small working memory and Obsidian as the durable personal knowledge substrate.

Hermes memory should keep only compact, high-value facts that reduce future steering: user preferences, stable environment conventions, and procedural lessons. Obsidian should hold the larger corpus: sources, notes, project decisions, literature, personal research, meeting summaries, and evolving syntheses. This matches the existing wiki conclusion that durable agent memory should be inspectable, editable, integrated, and operational. [concepts/llm-wiki-agent-memory-research-framework.md]

The core practice is a three-layer system:

  1. Capture layer: Obsidian Web Clipper, manual notes, pasted sources, Hermes web extraction, PDFs, meeting transcripts, and session exports.
  2. Knowledge layer: markdown notes with frontmatter, source maps, Obsidian links, append-only logs, and git history.
  3. Agent layer: Hermes workflows for ingest, query, synthesize, lint, review, and scheduled maintenance.

Do not start with a vector DB. Start with markdown+git, Obsidian search, Hermes search_files, and optionally SQLite FTS/BM25. Add embeddings only after a small evaluation set shows lexical search misses important questions. This follows the research synthesis: vector-only retrieval fails exact strings, while local-first markdown+git provides inspectability and reviewability. [concepts/llm-wiki-agent-memory-research-framework.md]

Design Principles

1. Obsidian is canonical, Hermes is an operator

Obsidian vault files are the source of truth. Hermes can create, edit, lint, and query them, but every non-trivial edit should be visible as a markdown diff and committed to git.

Why: the existing research emphasizes white-box memory, user control, auditability, and git-native review. Black-box memories become hard to inspect and can accumulate junk. [raw/github/mem0-issue-4573-memory-audit-junk.md] [raw/product-docs/openai-chatgpt-memory-2024-2025.md]

2. Separate memory classes

Use distinct folders and schemas for different memory types:

| Class | Purpose | Canonical location | Hermes treatment |
|---|---|---|---|
| User profile | Durable preferences and personal facts | 00-system/user-profile.md plus Hermes USER.md for tiny subset | Ask or confirm before major changes |
| Project memory | Project conventions, decisions, status | 20-projects/<project>/ | Retrieve only when scoped to project |
| Research knowledge | Sources, claims, syntheses | 30-research/<topic>/ | Citation-required edits |
| Procedures | Reusable workflows | Hermes skills + 40-procedures/ | Promote stable procedures to skills |
| Session notes | Chat/task transcripts and summaries | 50-sessions/YYYY-MM-DD-*.md | Summarize, do not auto-store all details |
| Raw sources | Immutable source text/captures | 90-sources/ or topic-local raw/ | Preserve source text + metadata |
| Private/sensitive | Secrets-adjacent or personal data | 99-private/ | Default exclude from automation/indexing |

This prevents the Letta-style issue where agent-global memory files pollute unrelated conversations and create privacy/token-cost problems. [raw/github/letta-issue-652-per-conversation-context-scoping.md]

3. Prefer compiled wiki pages over repeated query-time reconstruction

Every useful research answer should be eligible to become a durable page, not just a chat response. This follows Karpathy's LLM Wiki pattern: raw sources -> wiki -> schema, with ingest/query/lint operations. [raw/articles/karpathy-llm-wiki-gist-2026.md]

4. Use provenance at claim granularity when useful

Every non-trivial claim in a concept/plan/research note should cite either a raw note or a source URL. For personal notes, cite session/date or direct user assertion where appropriate.

Minimum citation form:

markdown
Claim text. [source-note.md]

For higher-value notes:

markdown
| Claim | Source | Evidence | Confidence | Status |
|---|---|---|---|---|

5. Treat retrieved memory as context, not new evidence

When Hermes retrieves a note and later writes a summary, it must not re-store the retrieved content as if the user said it again. This directly addresses feedback-loop amplification described in the mem0 audit. [raw/github/mem0-issue-4573-memory-audit-junk.md]

6. Default-reject high-entropy transient state

Negative memory filtering should include an entropy filter that rejects transient state by default. Here entropy does not mean Shannon entropy. It is shorthand for content with low value for future retrieval: short-lived, low-reuse, context-dependent, hard to interpret later, token-expensive, and likely to pollute retrieval.

High-entropy memory candidates include:

  1. Shell/tool output and command logs. Do not store "Ran: npm install" plus raw output. Store only the durable resolution when one exists, such as "Resolution: package lock corruption caused install failure; deleting the lockfile fixed it."
  2. Agent chain-of-thought or exploratory reasoning traces. Do not store "first I thought X, then maybe Y". Store final reasoning, durable conclusions, and optionally rejected hypotheses when they are useful for future debugging.
  3. Repeated retrieval excerpts. Do not re-store paragraphs copied from retrieved notes or raw sources. This creates recursive amplification where future retrieval finds retrieval artifacts rather than original evidence.
  4. Conversational scaffolding. Avoid saving "user asked, assistant suggested, then explored" narration unless the conversation structure itself is the durable fact. Compress to the decision, for example "Decision: use filesystem-first integration".
  5. Temporary operational state. Do not store "need to check later", "maybe investigate", or "could benchmark" unless the item is promoted into a TODO system, issue tracker, or review queue.

The filter's default policy is reject-unless-durable. A memory candidate should pass only if it is a stable user preference, durable fact, accepted decision, reusable procedure, unresolved but tracked open question, or source-backed synthesis. Otherwise it belongs in ephemeral session context, not long-term memory.
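The reject-unless-durable policy can be sketched as an allowlist check. This is a minimal illustration, not a Hermes API: the MemoryCandidate type, the kind labels, and the assumption that an upstream step has already classified each candidate are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for the durable categories listed above.
DURABLE_KINDS = {
    "stable_preference",
    "durable_fact",
    "accepted_decision",
    "reusable_procedure",
    "tracked_open_question",
    "source_backed_synthesis",
}


@dataclass
class MemoryCandidate:
    text: str
    kind: str  # assumed to be assigned upstream by a classifier or the user


def passes_entropy_filter(candidate: MemoryCandidate) -> bool:
    """Reject by default; pass only candidates whose kind is explicitly durable."""
    return candidate.kind in DURABLE_KINDS


decision = MemoryCandidate("Decision: use filesystem-first integration", "accepted_decision")
tool_log = MemoryCandidate("Ran: npm install", "tool_output")
print(passes_entropy_filter(decision), passes_entropy_filter(tool_log))
```

An allowlist keeps the default safe: a candidate with an unknown or missing kind is rejected without needing a rule of its own.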

Single Source of Truth Boundary

The knowledge base is meant to let an agent quickly enter the vault, locate evidence, and answer "what is true here?" without guessing. Therefore the system must distinguish canonical source documents, agent-authored knowledge, and rebuildable indexes.

Canonical layers

| Layer | Canonical? | Mutable? | Purpose | Examples |
|---|---|---|---|---|
| Raw sources | Yes | No, except metadata correction | Evidence substrate. Every non-trivial claim must be traceable here. | Original article markdown, PDF file, screenshot, image, transcript, GitHub issue export, or a note pointing to an exact local storage path. |
| Wiki notes | Yes, but derived from raw sources | Yes, reviewed edits | Agent/human-authored concepts, decisions, comparisons, queries, syntheses. | concepts/*.md, comparisons/*.md, queries/*.md, project decision notes. |
| Logs | Yes | Append-only | Chronology of knowledge-base actions and source changes. | log.md, project logs, ingestion logs. |
| User/project profile notes | Yes within their scope | Yes, stricter review | Durable user/project facts, preferences, conventions. | 00-system/user-profile.md, 20-projects/*/project-memory.md. |
| Hermes built-in memory | No for corpus truth | Yes, tiny steering cache | Compact pointer/steering memory only. | Vault path, stable preferences, repeated corrections. |
| Derived indexes | No | Rebuild-only | Retrieval acceleration. Must never be the only copy of knowledge. | SQLite FTS, BM25 index, vector store, graph projection. |
| Published site | No | Generated | Read-only presentation surface. | VitePress/GitHub Pages output. |

Raw sources are immutable evidence

Raw sources must be maintained separately from wiki/concept notes, following Karpathy's LLM Wiki pattern. Raw sources are not summaries. They are evidence records. A raw source may be:

  1. A preserved original or near-original text extraction, such as article markdown, paper text, transcript text, GitHub issue JSON/markdown, or official docs markdown.
  2. A binary/original artifact stored in the vault or repo, such as PDF, image, screenshot, audio, or downloaded HTML.
  3. A pointer note that records an exact local path, content hash, source URL, retrieval date, and access instructions when the artifact is too large or cannot be copied into the vault.

A raw source may include ## Extraction Notes, but those notes are commentary. They are not the raw source itself and must not replace preserved source content or artifact path.
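A pointer note needs a content hash that can be recomputed later to confirm the artifact has not changed. The sketch below shows one way to produce it with the Python standard library. The function name is hypothetical, and the "sha256:" prefix is an assumption based on the content_sha256 placeholder in the raw source schema.

```python
import hashlib
from pathlib import Path


def content_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large PDFs or audio files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    # Prefix format is an assumption, not a confirmed convention of the vault.
    return "sha256:" + digest.hexdigest()
```

Recomputing the hash during lint and comparing it with the stored value is a cheap way to detect silent edits to files that are supposed to be immutable.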

Allowed edits to raw source files:

  • Add or correct metadata.
  • Add missing artifact paths, hashes, retrieval timestamps, or extraction status.
  • Mark extraction as partial/blocked/truncated.
  • Add an erratum note that the capture was defective.

Disallowed edits to raw source files:

  • Rewriting original source wording for clarity.
  • Deleting inconvenient source text.
  • Collapsing full source content into a summary while still marking it as raw.
  • Mixing synthesis claims into raw text without a clearly labeled analysis section.

Truth lookup order for agents

When asked for a fact, Hermes should resolve truth in this order:

  1. Identify the active scope: topic, project, user, time range, and privacy boundary.
  2. Read the relevant index or topic map to locate candidate wiki notes.
  3. Read the wiki note for current synthesis, confidence, contested status, and source links.
  4. Follow source links to raw sources for verification of non-trivial claims.
  5. If the wiki and raw source conflict, raw source wins as evidence, but the wiki may contain later synthesis explaining the conflict.
  6. If no raw source supports a claim, answer with insufficient evidence or mark the claim as inference/speculation.
  7. If retrieved memories or previous assistant outputs contain a claim but no raw/user-confirmed source, do not treat it as truth.

One-writer rules by artifact type

| Artifact | Who may write directly | Review requirement |
|---|---|---|
| Raw source artifact | Capture tooling or explicit user instruction | Metadata-only corrections may be direct; content replacement requires review. |
| Concept/comparison/query note | Hermes or human | Direct patch allowed if source-backed; contested/high-impact changes need review note. |
| User profile note | Human or Hermes with explicit confirmation | Must show diff; no silent update of sensitive personal facts. |
| Project memory note | Hermes or human within active project scope | Direct patch allowed for stable conventions/decisions; ephemeral task state rejected. |
| Procedures/playbooks | Hermes or human | Promote to Hermes skill only when reusable and verified. |
| Derived index | Script/automation only | Rebuild from canonical markdown; never hand-edit. |
| Published site | GitHub Actions only | Generated from repo state. |

Source identifiers

Every raw source should have a stable source_id so wiki notes can cite sources even if filenames move.

Format:

text
src:<type>:<slug>:<year-or-date>

Examples:

text
src:article:karpathy-llm-wiki-gist:2026
src:paper:memgpt:2023
src:github:mem0-issue-4573-memory-audit-junk:2026-05-14
src:clip:obsidian-web-clipper:2026-05-14

Wiki notes may cite both source_id and path. Path is for agent navigation; ID is for durable reference.
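A lint step could validate source IDs against the format above. The following is a rough sketch: the regular expression encodes one reading of the format (lowercase type and slug, then a four-digit year or an ISO date), and the SourceId type and parse_source_id function are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Assumed grammar: src:<type>:<slug>:<YYYY or YYYY-MM-DD>, lowercase with hyphens.
SOURCE_ID_RE = re.compile(
    r"^src:(?P<kind>[a-z][a-z0-9-]*):(?P<slug>[a-z0-9][a-z0-9-]*):(?P<when>\d{4}(?:-\d{2}-\d{2})?)$"
)


class SourceId(NamedTuple):
    kind: str
    slug: str
    when: str


def parse_source_id(value: str) -> Optional[SourceId]:
    """Return the parsed parts, or None when the value is a path or malformed ID."""
    match = SOURCE_ID_RE.match(value)
    if match is None:
        return None
    return SourceId(match["kind"], match["slug"], match["when"])


print(parse_source_id("src:github:mem0-issue-4573-memory-audit-junk:2026-05-14"))
print(parse_source_id("raw/articles/karpathy-llm-wiki-gist-2026.md"))  # a path, so None
```

Returning None for paths lets the same function separate the two citation forms in a sources list.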

Recommended Vault Layout

Use one Obsidian vault for the personal knowledge base, ideally git-backed and local-first.

text
Obsidian Vault/
  00-system/
    SCHEMA.md
    AGENT-RULES.md
    user-profile.md
    memory-policy.md
    review-queue.md
    dashboards/
      research.base
      projects.base
      memory-audit.base
  10-inbox/
    clips/
    notes/
    transcripts/
  20-projects/
    hermes-agent/
      index.md
      decisions/
      tasks/
      sources/
  30-research/
    agent-memory/
      index.md
      concepts/
      comparisons/
      queries/
      raw/
      log.md
  40-procedures/
    skills-candidates/
    playbooks/
  50-sessions/
    active/
    archive/
  60-people/
  70-entities/
  80-attachments/
  90-sources/
    web/
    pdf/
    github/
  99-private/
    .agentignore

For the current /Users/a17/wiki, two practical options exist:

  1. Keep it as a project/research repo and open it directly as an Obsidian vault.
  2. Move or mirror it into the larger Obsidian vault under 30-research/agent-memory/.

Recommendation: keep /Users/a17/wiki as the research repo for this topic, and optionally add it as a separate Obsidian vault. This avoids mixing publication/Docsify files with the user's entire personal vault before the workflow stabilizes.

Note Schemas

Schema discipline is the bridge between Obsidian as a human note app and Hermes as an agentic knowledge operator. The schema should be strict enough for search, dashboards, lint, and automation, but not so complex that humans stop writing notes.

Global frontmatter fields

Every managed note, except README.md and simple generated/publication files, should use YAML frontmatter.

| Field | Required | Values | Meaning |
|---|---|---|---|
| id | Yes | stable slug-like ID | Durable reference independent of filename. |
| title | Yes | string | Human-readable title. |
| type | Yes | raw_source, concept, comparison, query, decision, project_memory, user_profile, session_summary, procedure, dashboard, index | Note class. |
| created | Yes | YYYY-MM-DD | Creation date. |
| updated | Yes | YYYY-MM-DD | Last meaningful content update. |
| status | Yes | draft, active, review, contested, superseded, archived | Lifecycle state. |
| tags | Yes | list | Search/dashboard tags. |
| scope | Yes | object | User/project/topic/channel boundary. |
| visibility | Yes | private, internal, public | Publication and automation boundary. |
| agent_read | Yes | boolean | Whether agents may read by default. |
| agent_write | Yes | never, propose, direct | Whether agents may write directly. |
| sources | Conditional | list of source IDs or paths | Required for source-backed wiki notes. |
| confidence | Conditional | low, medium, high | Required for concepts/comparisons/queries/decisions. |
| contested | Conditional | boolean | Required for concepts/comparisons/queries/decisions. |
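The field table translates directly into a lint check. The sketch below assumes the frontmatter has already been parsed into a dict by some YAML loader, and validate_frontmatter is a hypothetical helper name. It does not check the conditional sources field, because deciding whether a note is "source-backed" needs a rule the table does not give.

```python
REQUIRED_FIELDS = (
    "id", "title", "type", "created", "updated", "status",
    "tags", "scope", "visibility", "agent_read", "agent_write",
)
ALLOWED_VALUES = {
    "type": {"raw_source", "concept", "comparison", "query", "decision",
             "project_memory", "user_profile", "session_summary",
             "procedure", "dashboard", "index"},
    "status": {"draft", "active", "review", "contested", "superseded", "archived"},
    "visibility": {"private", "internal", "public"},
    "agent_write": {"never", "propose", "direct"},
    "confidence": {"low", "medium", "high"},
}
# Note types for which confidence and contested are required.
CLAIM_TYPES = {"concept", "comparison", "query", "decision"}


def validate_frontmatter(meta: dict) -> list[str]:
    """Return human-readable problems; an empty list means the note passes."""
    problems = [f"missing required field: {name}" for name in REQUIRED_FIELDS if name not in meta]
    for name, allowed in ALLOWED_VALUES.items():
        if name in meta and meta[name] not in allowed:
            problems.append(f"invalid {name}: {meta[name]!r}")
    if "agent_read" in meta and not isinstance(meta["agent_read"], bool):
        problems.append("agent_read must be a boolean")
    if meta.get("type") in CLAIM_TYPES:
        for name in ("confidence", "contested"):
            if name not in meta:
                problems.append(f"missing conditional field: {name}")
    return problems
```

Returning a list of problems instead of raising on the first one lets the lint/audit operator report every issue in a note in a single pass.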

Raw source note schema

Raw sources are immutable evidence records. They must either contain preserved source text or point to an exact artifact path that Hermes can locate and read with appropriate tools.

yaml
---
id: src:article:example-source:2026-05-16
title: Example Source Title
type: raw_source
created: 2026-05-16
updated: 2026-05-16
status: active
tags: [raw-source, article]
scope:
  users: [a17]
  projects: []
  topics: [agent-memory]
  channels: []
visibility: public
agent_read: true
agent_write: propose
source:
  source_url: https://example.com/article
  original_artifact_path: raw/assets/example-source.html
  local_text_path: raw/articles/example-source.md
  media_paths: []
  captured_by: hermes-web | obsidian-clipper | manual | pdf-parser | screenshot | api
  captured_at: 2026-05-16T00:00:00Z
  content_sha256: sha256:...
  license: unknown
  access_notes: public web page
source_derivation:
  derived_from: []
  transformation: []
raw_preservation: full_text | full_binary | full_html | full_pdf_text | pointer_only | transformed_text | tool_parsed_or_summarized_text | extraction_blocked
extraction_status: complete | partial | blocked | needs_pdf_pass | needs_manual_review
reliability: high | medium | low
---

Required body sections:

markdown
# Source Title

## Source Metadata

## Original Artifact / Storage Path

- original_artifact_path: raw/assets/example-source.html
- local_text_path: raw/articles/example-source.md
- media_paths: []

## Parsed Source Text

Preserved source text goes here. If source is binary-only, write where the binary lives and how an agent should read it.

## Extraction Notes

Only commentary about extraction quality, missing sections, blocked access, or parser limitations.

Source derivation for transformed sources

Use source_derivation when a source note is not the original artifact but a transformed representation of another raw source. Examples include OCR outputs, transcript cleanups, parsed PDFs, translated versions, normalized HTML extracts, and markdown cleanup passes.

yaml
source_derivation:
  derived_from:
    - src:pdf:memgpt-paper:2023
  transformation:
    - OCR
    - markdown_cleanup

Rules:

  • derived_from must point to source IDs or paths for the upstream raw/original artifact.
  • transformation must list every meaningful processing step that changed representation or wording.
  • A transformed source is still evidence, but it is not the root evidence. Agents should follow derived_from when exact wording, layout, figures, or legal/provenance questions matter.
  • Translations must record source language and target language in transformation or Extraction Notes.
  • Cleanup-only transformations must not silently remove uncertainty, OCR errors, speaker labels, timestamps, page numbers, or source line/page references.
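The rule that agents should follow derived_from to reach root evidence can be sketched as a graph walk. This assumes an in-memory mapping from source ID to parsed frontmatter. The root_sources function and the OCR note ID in the usage example are hypothetical.

```python
def root_sources(source_id: str, notes: dict[str, dict]) -> list[str]:
    """Walk derived_from links upward and return the sources that have no upstream."""
    roots: list[str] = []
    seen: set[str] = set()
    stack = [source_id]
    while stack:
        current = stack.pop()
        if current in seen:  # guards against accidental cycles in derived_from
            continue
        seen.add(current)
        derivation = notes.get(current, {}).get("source_derivation") or {}
        parents = derivation.get("derived_from") or []
        if parents:
            stack.extend(parents)
        else:
            # No upstream recorded: treat this note as root evidence.
            roots.append(current)
    return roots


notes = {
    "src:ocr:memgpt-paper-text:2023": {
        "source_derivation": {"derived_from": ["src:pdf:memgpt-paper:2023"],
                              "transformation": ["OCR", "markdown_cleanup"]},
    },
    "src:pdf:memgpt-paper:2023": {"source_derivation": {"derived_from": [], "transformation": []}},
}
print(root_sources("src:ocr:memgpt-paper-text:2023", notes))
```

An ID that appears in derived_from but has no note of its own is also returned as a root, which surfaces dangling references for the lint pass.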

Concept note schema

Concept notes are agent/human-authored synthesis pages. They are mutable, but every factual claim should trace to raw sources.

yaml
---
id: concept:hermes-obsidian-personal-kb
title: Hermes Obsidian Personal Knowledge Base Plan
type: concept
created: 2026-05-14
updated: 2026-05-16
status: active
tags: [agent-memory, obsidian, hermes]
scope:
  users: [a17]
  projects: [wiki]
  topics: [agent-memory, personal-knowledge-base]
  channels: []
visibility: public
agent_read: true
agent_write: direct
sources:
  - src:article:karpathy-llm-wiki-gist:2026
  - raw/articles/karpathy-llm-wiki-gist-2026.md
confidence: medium
contested: true
review:
  last_reviewed: 2026-05-16
  next_review: 2026-06-16
---

Required body sections:

markdown
# Title

## Executive Summary
## Claims
## Architecture / Analysis
## Open Questions
## Source Map
## Current Corrections / Evidence Gaps

Decision note schema

yaml
---
id: decision:project:short-title:2026-05-16
title: Decision Title
type: decision
created: 2026-05-16
updated: 2026-05-16
status: active
tags: [decision]
scope:
  users: [a17]
  projects: [hermes-agent]
  topics: []
  channels: []
visibility: private
agent_read: true
agent_write: propose
sources: []
confidence: medium
contested: false
supersedes: []
superseded_by: null
---

Required body sections:

markdown
# Decision
## Context
## Options Considered
## Decision
## Consequences
## Evidence / Sources
## Review Date

Session summary schema

Session notes are not raw memory dumps. Knowledge is a compressed record of state transitions, not an interaction history. A session note does not replay a transcript, chronology, tool log, or agent reasoning trace; it distills the future-relevant state transitions from an interaction.

Memory pipeline:

text
interaction -> working context -> temporary scratch -> candidate extraction -> entropy filter -> durable knowledge -> retrieval index

Do not create a durable session note for every session. Create one only when the interaction crosses the memory extraction threshold below. Even then, session summaries are not permanently authoritative: a session is a temporary container for candidate knowledge and an interaction buffer, not long-term knowledge.

Memory class model

Retention is determined by memory class, not by folder. Use four classes:

| Class | Purpose | Lifecycle | Default retrieval |
|---|---|---|---|
| canonical | Core long-term knowledge | Permanent | Yes |
| semantic | Distilled long-term experience | Long-term | Yes |
| operational | Project runtime state | Medium-term | Scoped only |
| episodic | Session-level process record | Short-term | Weak by default |

Session notes should normally be memory_class: episodic. Their job is to act as an interaction buffer and extraction substrate, not as a permanent conversational archive.

Canonicalization rule

Promote durable items out of sessions as soon as practical:

| Session content | Target location |
|---|---|
| Architecture decision | decisions/ |
| Stable workflow | procedures/ |
| Durable synthesis | concepts/ |
| Reusable investigation | research/ |
| Stable preference | user-profile.md |
| Validated operational rule | project memory |

Once promoted, the session's value and retrieval priority should decline. The session should retain only a Canonicalized Items pointer list.

Minimal session frontmatter

Avoid excessive metadata and YAML bloat. Keep only operationally useful fields:

yaml
---
id: session:2026-05-16-hermes-memory-design
title: Hermes Memory Design Session
type: session_summary

created: 2026-05-16
updated: 2026-05-16

memory_class: episodic

status: active

retention:
  mode: adaptive
  half_life_days: 30
  retrieval_weight: 1.0

memory_stats:
  retrieval_count: 0
  citation_count: 0
  promoted_items: 0

scope:
  projects:
    - wiki
  topics:
    - memory-architecture

contains:
  decisions: true
  procedures: true
  source_analysis: true
  transient_debugging: false

canonicalized: false
archive_candidate: false
---

Required body sections:

markdown
# Session Summary
## Durable Outcomes
## Decisions
## Procedures Validated
## Sources Added
## Open Questions
## Canonicalized Items
## Rejected / Do Not Store

Forbidden sections/content by default:

  • Transcript dump / record
  • Chronological replay
  • Tool logs
  • Shell output spam
  • Chain-of-thought replay
  • Repeated retrieval/source excerpts

Memory extraction threshold

A session should become a durable note only if at least one future-relevant state transition occurred.

Save a session note when any of these are true:

  • Architecture changed.
  • Durable preference discovered.
  • Reusable procedure validated.
  • Source added.
  • Decision finalized.
  • Long-term research synthesis produced.
  • Unresolved question identified and worth tracking.

Do not save a durable session note for:

  • One-off debug.
  • Retry command loops.
  • Casual brainstorming with no decision or reusable output.
  • Failed experiments with no reusable conclusion.
  • Short QA.
  • Temporary planning.
  • Generic how-to help, such as "how do I install X", unless it yields a reusable procedure or project convention.

Adaptive retention and half-life

Session half-life does not mean deleting notes after 30 days. It means dynamically decaying retrieval priority while preserving auditability.

Retention stages:

  1. Active: new session has retrieval_weight: 1.0 and may participate in scoped retrieval.
  2. Decaying: after each half_life_days, set retrieval_weight *= 0.5. Keep the file, but do not proactively place it into context unless scoped retrieval needs it.
  3. Archive candidate: set archive_candidate: true when retrieval_count == 0, citation_count == 0, canonicalized == true, and age exceeds 90 days.
  4. Compression: archived sessions should be compressed from episodic record into semantic outcome. Example: "explored REST API; tested filesystem writes; benchmarked latency; decided filesystem-first" becomes "Outcome: filesystem-first integration selected for MVP stability."
  5. Deletion: rare and requires human confirmation. Delete only if canonicalized, unreferenced, no inbound links, no project dependency, long-term retrieval count is 0, and there is no audit value.

Reinforcement rule: session memory can strengthen as well as decay. If a session is retrieved or cited, increment retrieval_count or citation_count and add a small reinforcement bonus to retrieval_weight, bounded by 1.0. Valuable operational memory stays discoverable; noise naturally decays.
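The decay, archive-candidate, and reinforcement rules reduce to a few lines of arithmetic. The sketch below is one reading of them: decay is applied in whole half-life steps, and the 0.1 reinforcement bonus is a placeholder, since the plan only says "small". All function names are hypothetical.

```python
from datetime import date


def decayed_weight(created: date, today: date,
                   half_life_days: int = 30, initial: float = 1.0) -> float:
    """Halve the weight once per completed half-life since creation."""
    age_days = max((today - created).days, 0)
    completed_half_lives = age_days // half_life_days
    return initial * 0.5 ** completed_half_lives


def is_archive_candidate(age_days: int, retrieval_count: int,
                         citation_count: int, canonicalized: bool) -> bool:
    """Stage 3: unused, uncited, already canonicalized, and older than 90 days."""
    return (retrieval_count == 0 and citation_count == 0
            and canonicalized and age_days > 90)


def reinforce(weight: float, bonus: float = 0.1) -> float:
    """Add a bonus on retrieval or citation, capped at 1.0 (bonus size assumed)."""
    return min(1.0, weight + bonus)


# A session created 2026-05-16 is 60 days old on 2026-07-15: two half-lives.
print(decayed_weight(date(2026, 5, 16), date(2026, 7, 15)))
```

Computing the weight from created and today, instead of mutating a stored value on a schedule, keeps the result reproducible even if a maintenance run is skipped.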

Retrieval policy for session memory

Episodic sessions should not outrank canonical knowledge. Retrieval priority should be:

  1. Canonical concepts
  2. Decisions
  3. Procedures
  4. Project memory
  5. Semantic syntheses
  6. Active operational notes
  7. Episodic sessions
  8. Archived sessions

This prevents session noise from contaminating long-term knowledge retrieval.
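One simple way to enforce this ordering is to sort retrieval candidates by class rank first and relevance score second. The sketch below takes that strict class-first reading, which is an assumption; a weighted blend would also fit the policy. The class labels and the Candidate type are hypothetical.

```python
from typing import NamedTuple

# Hypothetical labels, in the priority order listed above.
CLASS_PRIORITY = [
    "canonical_concept", "decision", "procedure", "project_memory",
    "semantic_synthesis", "operational_note", "episodic_session", "archived_session",
]


class Candidate(NamedTuple):
    path: str
    kind: str
    score: float  # relevance score from lexical or semantic search


def rank_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Order by class priority, then by descending relevance within a class."""
    return sorted(candidates, key=lambda c: (CLASS_PRIORITY.index(c.kind), -c.score))


hits = [
    Candidate("50-sessions/active/2026-05-16-design.md", "episodic_session", 0.9),
    Candidate("decisions/filesystem-first-architecture.md", "decision", 0.6),
]
print([c.path for c in rank_candidates(hits)])
```

With a strict ordering, a highly relevant session note still ranks below a moderately relevant decision note, which is the contamination guard the policy asks for.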

Canonicalization pipeline

Session memory should not directly serve retrieval forever. Durable knowledge should move through:

text
session -> extract durable items -> promote to canonical notes -> lower session importance

Example: a session note that records "Filesystem-first integration selected" should produce decisions/filesystem-first-architecture.md. The session then keeps only:

markdown
## Canonicalized Items
- [[filesystem-first-architecture]]

The operational goal is:

text
session entropy -> semantic extraction -> canonical knowledge

not storing more sessions forever.

Automation and metrics

Automation may update retrieval_count, calculate decay, mark archive candidates, draft compression proposals, and suggest canonical extraction. Human confirmation is required for deletion, modifying canonical knowledge, publishing session content, and changing user profile facts.

Track only minimal operational metrics:

yaml
memory_metrics:
  active_sessions:
  archived_sessions:
  avg_retrieval_count:
  canonicalization_rate:
  stale_session_ratio:
  orphan_session_ratio:

Recommended session directory layout:

text
50-sessions/
  active/
  archive/

Do not split sessions too deeply by year; sessions should not be the primary navigation layer. Long-term navigation belongs in concepts/, decisions/, procedures/, and projects/.

Procedure note schema

yaml
---
id: procedure:ingest-source
title: Ingest Source Procedure
type: procedure
created: 2026-05-16
updated: 2026-05-16
status: active
tags: [procedure, ingestion]
scope:
  users: [a17]
  projects: [wiki]
  topics: [agent-memory]
  channels: []
visibility: public
agent_read: true
agent_write: direct
sources: []
promote_to_skill: false
---

Required body sections:

markdown
# Procedure
## Trigger
## Inputs
## Steps
## Validation
## Failure Modes
## Commit Message

Hermes Roles

1. Ingest operator

Input: URL, PDF, pasted text, GitHub issue/repo, meeting transcript, or user instruction.

Output:

  • raw source note with preserved parsed text
  • extracted claims/entities if useful
  • updated topic index/log
  • optional concept page update
  • git commit

Guardrails:

  • Preserve raw text before synthesis.
  • Mark blocked/truncated extraction explicitly.
  • Do not create many tiny pages for passing mentions.
  • Search existing notes before creating new notes.

2. Retrieval/context operator

Input: user question or task.

Output:

  • structured working set: role-separated, token-budgeted, semantically compressed execution context
  • optional answer with citations

Retrieval order:

  1. Current project/topic index
  2. Exact search over filenames/tags/headings
  3. Full-text/BM25 search
  4. Structured filters over tags/frontmatter
  5. Optional semantic search
  6. Raw source fallback

After retrieval, use Working Set Assembly v1: retrieval is only candidate generation, clusters are the meaning units, and the final working set is a runtime artifact rather than a stored note.

3. Synthesis editor

Input: set of sources/notes and a target note.

Output:

  • proposed patch to target note
  • source map update
  • log update

Guardrails:

  • Never silently overwrite raw sources.
  • Preserve conflicts and mark contested.
  • Do not turn low-confidence source snippets into high-confidence claims.

4. Lint/audit operator

Checks:

  • broken wikilinks
  • missing frontmatter
  • notes without sources
  • raw files without Parsed Source Text
  • stale next_review dates
  • orphan notes
  • duplicate concepts
  • uncited claims in concept notes
  • private folder accidentally referenced by public pages
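As an illustration of the first check, broken wikilinks can be found with a regular expression over note bodies. The sketch assumes notes are keyed by the name used inside [[...]] (no .md extension) and that links resolve by that name alone. Obsidian's real resolution also handles paths, so this is a simplification, and broken_wikilinks is a hypothetical helper.

```python
import re

# Captures the target of [[target]], [[target#heading]], and [[target|alias]].
# Same-note links such as [[#heading]] are skipped.
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def broken_wikilinks(notes: dict[str, str]) -> dict[str, list[str]]:
    """Map each note name to the sorted link targets that match no known note."""
    known = set(notes)
    report: dict[str, list[str]] = {}
    for name, text in notes.items():
        targets = {target.strip() for target in WIKILINK_RE.findall(text)}
        missing = sorted(targets - known)
        if missing:
            report[name] = missing
    return report


vault = {
    "session-2026-05-16": "## Canonicalized Items\n- [[filesystem-first-architecture]]\n- [[missing-note|alias]]",
    "filesystem-first-architecture": "See [[session-2026-05-16#Decisions]] and [[#Context]].",
}
print(broken_wikilinks(vault))
```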

5. Reflection/consolidation operator

Runs periodically or manually. It should propose, not automatically apply, major memory changes.

Inputs:

  • recent session summaries
  • project logs
  • inbox notes
  • review queue

Outputs:

  • candidate updates to user profile, project memory, or concept pages
  • candidate Hermes skill updates
  • rejected/no-store list for ephemeral facts

Hermes Built-in Memory vs Obsidian

Hermes built-in memory is intentionally bounded. Hermes documentation describes two core files, MEMORY.md and USER.md, injected at session start as a frozen snapshot with small character limits. That makes it useful for compact durable steering, not a full personal knowledge base.

Session summaries also are not the full personal knowledge base. They are short-lived candidate-knowledge containers. Promote durable content into canonical concepts, decisions, procedures, project memory, source-backed syntheses, or stable preferences; then let the session half-life mechanism lower retrieval priority.

Use Hermes memory for:

  • Stable user preferences
  • Stable environment facts
  • Repeated corrections
  • High-value conventions
  • Pointers to canonical Obsidian vault/repo paths
  • Durable resolutions or procedures that would prevent repeated future debugging

Do not use Hermes memory for:

  • Raw source text
  • Research corpora
  • Large project histories
  • Completed task logs
  • Temporary TODOs
  • Detailed meeting notes
  • Shell/tool output or install/build logs
  • Agent chain-of-thought or exploratory reasoning traces
  • Repeated excerpts from retrieved notes, search results, or raw sources
  • Conversational scaffolding with no durable decision
  • Future-maybe operational state that is not tracked in a TODO system, issue tracker, or review queue

Use Obsidian for larger artifacts, and use a review queue or issue tracker for unresolved operational follow-ups. If a candidate only explains what happened in a session but not what should be reused later, reject it from long-term memory.

Obsidian Integration Options

Option A: Direct filesystem access, default

Hermes reads/writes markdown files directly using file tools. This is already enough for local-first workflows.

Pros:

  • Simple
  • No plugin dependency
  • Git-friendly
  • Works when Obsidian is closed

Cons:

  • Does not know active Obsidian pane
  • Cannot trigger Obsidian commands
  • Must be careful with concurrent edits

Option B: Obsidian URI, light automation

Obsidian URI can open notes, create notes, open daily notes, search, and choose vaults via obsidian://.... Useful for generating local links from Hermes output or plan docs.

Use for:

  • Open a note after Hermes writes it
  • Link from dashboards to local Obsidian notes
  • Create daily note from external automation

Avoid relying on URI as the main write API; filesystem edits are easier to diff and test.
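For the link-generation use case, a small helper can build an obsidian://open URI with the vault and file parameters percent-encoded. This is a minimal sketch: obsidian_open_uri is a hypothetical name, and the vault name and note path in the example come from the recommended layout in this plan, not from a confirmed setup.

```python
from urllib.parse import quote


def obsidian_open_uri(vault: str, file: str) -> str:
    """Build an obsidian://open link; safe='' also encodes '/' as %2F."""
    return f"obsidian://open?vault={quote(vault, safe='')}&file={quote(file, safe='')}"


print(obsidian_open_uri("Obsidian Vault", "30-research/agent-memory/index"))
# prints obsidian://open?vault=Obsidian%20Vault&file=30-research%2Fagent-memory%2Findex
```

Hermes can append such a link to its output after writing a note, so the user can open the note in Obsidian while the write itself stays a plain filesystem edit.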

Option C: Obsidian Local REST API / built-in MCP, later

The Local REST API plugin provides authenticated HTTPS access to Obsidian and can read/create/update/delete notes, patch headings/frontmatter, search metadata/content, access the active file, manage periodic notes, query tags, and open files in Obsidian. Its README also says it exposes REST API and built-in MCP server interfaces.

Use when you need:

  • Active note context
  • Section/frontmatter patching through Obsidian
  • Tag/metadata operations through plugin APIs
  • MCP clients beyond Hermes file tools

Security note: keep the API bound locally, protect the API key, and do not expose it over the network.

Option D: Obsidian MCP server, optional

Community MCP servers can expose note read/write/search/frontmatter operations to MCP clients. Hermes has native MCP configuration support, so this can become a cleaner integration later.

Do not start here unless filesystem-first editing is insufficient.

Option E: Dataview or Bases dashboards

Dataview is a live index/query engine over markdown metadata and can render tables/lists from frontmatter and inline fields. Obsidian Bases is a core plugin for database-like views of notes and their properties.

Use for human review dashboards:

  • inbox items needing processing
  • raw sources with extraction_status != complete
  • concept notes with contested: true
  • notes where next_review <= today
  • project decision logs
  • memory candidates awaiting approval

Workflow Recipes

Recipe 1: Capture a web source

  1. User clips page with Obsidian Web Clipper into 10-inbox/clips/ or asks Hermes to ingest URL.
  2. Hermes creates/moves a raw note with source frontmatter and ## Parsed Source Text.
  3. Hermes extracts claims/entities into a short ## Extraction Notes section.
  4. Hermes searches existing concept/project pages.
  5. Hermes updates one target synthesis page or creates one if threshold is met.
  6. Hermes updates index/log.
  7. Hermes commits changes.

Recipe 2: Ask Hermes a knowledge question

  1. Hermes identifies active scope: user/project/topic/timeframe.
  2. Hermes retrieves candidates through lexical search, structured filters, and optional semantic search.
  3. Hermes ranks, clusters, compresses, deduplicates, and role-isolates results using Working Set Assembly v1.
  4. Hermes answers from the structured working set with source links.
  5. If the answer is reusable, Hermes asks or infers whether to save it as a query/concept note; the working set itself remains a runtime artifact, not a durable note.

Recipe 3: Convert a session into durable knowledge

  1. Check the memory extraction threshold before creating any durable note. Continue only if architecture changed, a durable preference was discovered, a reusable procedure was validated, a source was added, a decision was finalized, long-term research synthesis was produced, or a worthwhile unresolved question was identified.
  2. Treat the memory pipeline as: interaction -> working context -> temporary scratch -> candidate extraction -> entropy filter -> durable knowledge -> retrieval index.
  3. Extract future-relevant state transitions, not interaction history.
  4. Run the entropy filter before writing anything durable: reject shell/tool output, chain-of-thought, repeated retrieval excerpts, conversational scaffolding, completed task logs, retry loops, and future-maybe operational state.
  5. Keep only durable outcomes, decisions, new knowledge, reusable procedures, open questions, and evidence added.
  6. Use ## Rejected / Do Not Store only for audit-worthy rejections; otherwise omit ephemera entirely.
  7. Update project/concept/procedure notes if needed.
  8. Promote stable procedures to Hermes skills when they are reusable.

Recipe 4: Weekly memory audit

  1. Find notes changed in the last 7 days.
  2. Find memory candidates and user-profile changes.
  3. Check for unprocessed inbox items.
  4. Check raw sources with partial/blocked extraction.
  5. Check contested or low-confidence notes.
  6. Produce a review report and optional patch set.

Retrieval Strategy

Phase 1: lexical only

Use:

  • Obsidian built-in search
  • Hermes search_files
  • git grep/ripgrep via safe wrappers where needed
  • indexes/index.md files
  • tags and frontmatter

This is enough for the first few hundred notes if filenames, tags, and indexes are disciplined.

Phase 2: SQLite FTS/BM25

Add a small derived index:

text
.hermes-kb/index.sqlite
  notes(path, title, type, tags, updated, hash)
  sources(path, source_url, reliability, extraction_status)
  links(src, dst)
  fts_notes(path, title, headings, body)

The index is derived and can be rebuilt from markdown.
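A rough sketch of the `fts_notes` table and a BM25-ranked query, using Python's standard `sqlite3` module. It assumes the local SQLite build includes FTS5, uses an in-memory database instead of .hermes-kb/index.sqlite, and the sample rows and function names are invented for illustration.

```python
import sqlite3


def build_index(notes):
    """Build an in-memory FTS5 index from (path, title, headings, body) tuples."""
    conn = sqlite3.connect(":memory:")
    # path is stored but not tokenized, so it can be returned without affecting ranking.
    conn.execute(
        "CREATE VIRTUAL TABLE fts_notes USING fts5(path UNINDEXED, title, headings, body)"
    )
    conn.executemany("INSERT INTO fts_notes VALUES (?, ?, ?, ?)", notes)
    return conn


def search(conn, query, limit=30):
    """Return matching paths, best match first (a lower bm25() value ranks higher)."""
    rows = conn.execute(
        "SELECT path FROM fts_notes WHERE fts_notes MATCH ? "
        "ORDER BY bm25(fts_notes) LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


# Invented sample notes.
conn = build_index([
    ("concepts/memory.md", "Agent memory", "Scope", "Hermes memory is bounded working memory."),
    ("concepts/retrieval.md", "Retrieval", "BM25", "Lexical search handles names and paths."),
])
print(search(conn, "memory"))
```

Because the table is built only from note content, deleting the database and re-running the build reproduces it, which keeps the index a derived artifact.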

Phase 3: hybrid semantic retrieval

Add embeddings only after evaluation shows need. Store vectors outside markdown, keyed by file hash and heading/block IDs.

Use semantic retrieval for:

  • fuzzy conceptual recall
  • paraphrased questions
  • cross-topic discovery

Use exact/BM25 for:

  • names
  • file paths
  • commands
  • dates
  • IDs
  • quotes
  • prices/numbers

Working Set Assembly Standard v1

Working Set Assembly is a deterministic pipeline that transforms scoped retrieval results into a role-separated, token-budgeted, semantically compressed execution context for LLM reasoning.

Goal: convert retrieval results into the minimal sufficient context for LLM reasoning. Retrieval is exploration, not consumption. The cluster, not the note, is the primary meaning unit. The working set is a runtime artifact, not a storage structure.

Input:

  • query
  • scope such as project, topic, or user
  • retrieval_results

Output:

  • structured working_set

Data structures

All intermediate artifacts must be structured to avoid free-text drift.

CandidateNote

yaml
id: string
title: string
type: concept | decision | session | source | procedure
score: float
content: string
metadata:
  project: string
  tags: []
  updated: date

Cluster

yaml
cluster_id: string
theme: string
notes: [CandidateNote]
cluster_score: float

WorkingSetOutput

yaml
system_context: string
project_context: string
knowledge_context:
  clusters: []
evidence_context: []
task_context: string
token_budget:
  system: int
  project: int
  knowledge: int
  evidence: int
  task: int

Pipeline

Step 1 — Retrieve candidate notes

Inputs:

  • query
  • scope such as project, topic, or user
  • index/search backend

Rules:

  • Use three recall channels: lexical search such as BM25/grep, structured filters such as tags/frontmatter, and optional semantic search.
  • Output candidate_notes[].
  • topK = 30..80; do not make candidate sets too large.
  • Every candidate must include metadata: type, project, and updated.

Step 2 — Rank with fixed scoring

Use a versioned scoring function:

text
ranking_version: v1.0

score =
  0.35 * relevance(query, note)
+ 0.20 * project_scope_match
+ 0.15 * recency_decay(note.updated)
+ 0.15 * citation_frequency(note)
+ 0.10 * canonicality(note.type)
- 0.05 * redundancy_penalty

Canonicality weight order:

text
decision > concept > procedure > source > session

Output a sorted candidate list and keep top 20..40.
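The v1.0 formula can be sketched as a plain function. The weights come from the plan; the numeric canonicality values, the exponential form of `recency_decay`, and the 180-day half-life are assumptions, since the plan fixes only the ordering of note types and names the decay term without defining it.

```python
from datetime import date

# Assumed numeric values; the plan specifies only decision > concept > procedure > source > session.
CANONICALITY = {"decision": 1.0, "concept": 0.8, "procedure": 0.6, "source": 0.4, "session": 0.2}


def recency_decay(updated: date, today: date, half_life_days: float = 180.0) -> float:
    """Assumed exponential decay: 1.0 for a note updated today, 0.5 after one half-life."""
    age_days = max((today - updated).days, 0)
    return 0.5 ** (age_days / half_life_days)


def score_v1(relevance, scope_match, updated, citation_frequency, note_type, redundancy, today):
    """Weighted sum for ranking_version v1.0; numeric inputs are expected in [0, 1]."""
    return (
        0.35 * relevance
        + 0.20 * scope_match
        + 0.15 * recency_decay(updated, today)
        + 0.15 * citation_frequency
        + 0.10 * CANONICALITY[note_type]
        - 0.05 * redundancy
    )
```

With these assumptions the maximum possible score is 0.95, reached by a fully relevant, in-scope, freshly updated, heavily cited decision note with no redundancy.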

Step 3 — Cluster into meaning units

Goal: merge semantically nearby notes into theme blocks.

Prefer rule clustering using:

  • tag overlap
  • shared entities
  • shared project
  • heading similarity

Fallback:

text
cluster_key = dominant_tag OR project OR embedding_similarity

Constraints:

  • cluster count <= 8
  • notes per cluster <= 10
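One way to read the fallback rule and the two constraints is the tag-based sketch below. It assumes each candidate is a dict with `tags`, `score`, and optionally `project`; defining the cluster score as the sum of note scores is also an assumption, and embedding similarity is left out.

```python
from collections import Counter, defaultdict

MAX_CLUSTERS = 8
MAX_NOTES_PER_CLUSTER = 10


def cluster_by_dominant_tag(notes):
    """Group notes by their most common tag, falling back to the project name."""
    tag_counts = Counter(tag for note in notes for tag in note["tags"])
    clusters = defaultdict(list)
    # Visit high-scoring notes first so the per-cluster cap drops the weakest ones.
    for note in sorted(notes, key=lambda n: n["score"], reverse=True):
        key = max(
            note["tags"],
            key=lambda t: tag_counts[t],
            default=note.get("project", "unscoped"),
        )
        if len(clusters[key]) < MAX_NOTES_PER_CLUSTER:
            clusters[key].append(note)
    # Keep the clusters with the highest summed score (assumed cluster_score).
    ranked = sorted(
        clusters.items(), key=lambda kv: sum(n["score"] for n in kv[1]), reverse=True
    )
    return dict(ranked[:MAX_CLUSTERS])
```

A note with no tags lands in a cluster keyed by its project, which mirrors the `dominant_tag OR project` part of the fallback.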

Step 4 — Compress each cluster

Transform each cluster from a collection of notes into a semantic summary unit:

markdown
Cluster: <theme>

Key Claims:
- ...

Key Decisions:
- ...

Key Evidence:
- source refs

Conflicts:
- if any

Compression rules:

  • Delete repeated sentences.
  • Preserve conclusions, not process logs.
  • Preserve conflicts; do not average them away.
  • Preserve source pointers.

Step 5 — Deduplicate

Goal: avoid context pollution through repeated content.

Rules:

  1. Content-hash deduplication and similarity deduplication:

text
if similarity(note_a, note_b) > 0.85:
    keep higher canonicality

  2. Semantic duplicate priority:

text
decision > concept > cluster summary > session > raw

  3. Cross-cluster deduplication: if cluster A and cluster B express the same fact, keep it once and turn the other occurrence into a reference pointer.
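A minimal sketch of the first rule, combining an exact content hash with a similarity check at the 0.85 threshold. `difflib.SequenceMatcher` stands in for the unspecified `similarity()` function, and the note dicts with `type` and `content` keys are assumed; an embedding-based metric could replace the string ratio later.

```python
import hashlib
from difflib import SequenceMatcher

# Canonicality order from the ranking step; a lower rank wins a duplicate conflict.
RANK = {t: i for i, t in enumerate(["decision", "concept", "procedure", "source", "session"])}


def deduplicate(notes, threshold=0.85):
    """Drop exact and near duplicates, keeping the note with higher canonicality."""
    kept, seen_hashes = [], set()
    # Process the most canonical notes first so they are the ones retained.
    for note in sorted(notes, key=lambda n: RANK.get(n["type"], len(RANK))):
        digest = hashlib.sha256(note["content"].encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate of a note already kept
        if any(
            SequenceMatcher(None, note["content"], k["content"]).ratio() > threshold
            for k in kept
        ):
            continue  # near duplicate of a more canonical note
        seen_hashes.add(digest)
        kept.append(note)
    return kept
```

The pairwise comparison is quadratic in the number of notes, which should be acceptable for the 20..40 candidates that survive ranking.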

Step 6 — Isolate by role

Fixed partitions:

yaml
system_context: rules, constraints, safety
project_context: current scoped project state
knowledge_context: compressed clusters
evidence_context: raw source snippets or quotes
task_context: user query

Partition rules:

  • system_context does not come from retrieval; it is fixed prompt/rules.
  • project_context comes only from scoped notes; do not allow cross-project pollution.
  • knowledge_context contains only cluster compression output.
  • evidence_context contains minimal raw source snippets or quotes.

Step 7 — Assemble final context pack

Fixed token budget:

text
system:   10%
project:  20%
knowledge: 40%
evidence: 20%
task:     10%

Assembly rules:

  1. Order is fixed: system -> project -> knowledge -> evidence -> task.
  2. Evidence must be minimal: only necessary references, no full-text dumps, each item <= 3..8 lines.
  3. Knowledge uses only cluster summaries. Do not concatenate raw notes or dump sessions.
  4. If over budget, trim in this order:
    • session-based content
    • low-score clusters
    • redundant evidence
    • older notes
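The fixed shares and ordering can be sketched as follows. The four-characters-per-token estimate is a crude assumption, and the trimming is simplified: each partition's items are assumed to be pre-sorted so that the content the plan says to drop first (session-based content, low-score clusters, redundant evidence, older notes) comes last.

```python
# Percent shares and order from the plan.
BUDGET_PERCENT = {"system": 10, "project": 20, "knowledge": 40, "evidence": 20, "task": 10}
ORDER = ["system", "project", "knowledge", "evidence", "task"]


def estimate_tokens(text: str) -> int:
    """Crude assumption: roughly four characters per token, rounded up."""
    return (len(text) + 3) // 4


def assemble(parts: dict, total_tokens: int) -> str:
    """Join partitions in fixed order, cutting each one off at its token share."""
    sections = []
    for role in ORDER:
        limit = total_tokens * BUDGET_PERCENT[role] // 100
        used, kept = 0, []
        for item in parts.get(role, []):
            cost = estimate_tokens(item)
            if used + cost > limit:
                break  # everything after this point is trimmed
            kept.append(item)
            used += cost
        sections.append(f"[{role}]\n" + "\n".join(kept))
    return "\n\n".join(sections)
```

A real implementation would likely use the target model's tokenizer rather than a character estimate.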

Maintenance and observability

Version every stage so the pipeline remains reproducible:

yaml
ranking_version: v1.0
clustering_version: v1.0
compression_version: v1.0

Record metrics:

yaml
metrics:
  retrieved_count:
  clustered_count:
  compressed_size:
  final_tokens:
  redundancy_rate:
  evidence_ratio:

Debug mode, enabled with --debug-working-set, must output:

  • every step result
  • score breakdown
  • cluster formation
  • compression diff

MVP implementation

Minimum viable implementation:

  1. BM25 retrieve top 30.
  2. Apply simple weighted score.
  3. Use tag-based clustering.
  4. Summarize each cluster with LLM or deterministic rules.
  5. Deduplicate by hash and similarity threshold.
  6. Apply fixed role partition.
  7. Truncate by token budget.

Design principles:

  1. Retrieval is exploration, not consumption.
  2. Cluster is the meaning unit.
  3. Working set is a runtime artifact.

Evaluation Plan

Create 00-system/evals/personal-kb-queries.yml with 30-50 representative questions:

yaml
- id: q001
  question: What is the recommended Hermes memory vs Obsidian split?
  expected_sources:
    - 30-research/agent-memory/concepts/hermes-obsidian-personal-knowledge-base-plan.md
  must_include:
    - Hermes memory is bounded
    - Obsidian stores larger corpus

Measure:

  • retrieval recall@k
  • citation correctness
  • answer faithfulness
  • stale/conflicting answer rate
  • time to update knowledge after new evidence
  • number of rejected junk memories
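Of these measures, retrieval recall@k is the simplest to automate against the `expected_sources` field. A minimal sketch, assuming the retriever returns an ordered list of note paths:

```python
def recall_at_k(retrieved, expected, k):
    """Fraction of expected source paths that appear in the top-k retrieved paths."""
    if not expected:
        return 1.0  # nothing was expected, so nothing was missed
    top_k = set(retrieved[:k])
    return sum(1 for path in expected if path in top_k) / len(expected)


# Invented example: one of two expected sources appears in the top 2.
print(recall_at_k(["a.md", "b.md", "c.md"], ["a.md", "c.md"], k=2))
```

Averaging this value over all questions in the eval file gives one number to compare before and after a retrieval change.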

Automation and Permission Boundary

Automation must be explicit because the knowledge base is both a personal workspace and an agent-readable truth substrate.

Permission levels

| Level | Meaning | Allowed examples |
| --- | --- | --- |
| read_public | Agent may read public/research notes. | README.md, concepts/, public raw sources. |
| read_scoped | Agent may read only when current task scope matches note scope. | Project memory, session summaries. |
| read_explicit | Agent may read only after explicit user instruction. | 99-private/, sensitive personal notes. |
| write_direct | Agent may patch directly and commit. | Index/log updates, non-sensitive source-backed concept edits. |
| write_propose | Agent may create a patch/proposal, but user must approve. | User profile, project decisions, contested claims. |
| write_forbidden | Agent must not write. | Raw artifact content, private secrets, generated indexes by hand. |

Folder policy

| Folder | Read default | Write default | Publish default | Notes |
| --- | --- | --- | --- | --- |
| raw/ / 90-sources/ | allowed | propose for metadata, forbidden for source content rewrite | allowed only if visibility public | Immutable evidence. |
| concepts/, comparisons/, queries/ | allowed | direct if source-backed | allowed if visibility public | Main wiki layer. |
| 20-projects/ | scoped | propose/direct depending on project | private by default | Avoid leaking active work. |
| 50-sessions/ | scoped | propose | private by default | Summaries only, not transcript dumps. |
| 00-system/user-profile.md | scoped | propose only | never | Personal facts require confirmation. |
| 40-procedures/ | allowed | direct for non-sensitive procedures | allowed if visibility public | Promote stable procedures to skills. |
| 99-private/ | explicit only | forbidden unless explicit | never | Default-deny. |
| .hermes-kb/, vector stores, search indexes | tool/script only | rebuild-only | never | Derived artifacts. |

Automation classes

Safe automation:

  • Rebuild search indexes from markdown.
  • Lint missing frontmatter, broken links, missing raw source fields.
  • Generate read-only dashboards.
  • Append log entries for agent actions.
  • Draft review reports.

Needs review:

  • Editing user profile or personal facts.
  • Marking a contested claim as resolved.
  • Changing confidence from low/medium to high.
  • Deleting or archiving notes.
  • Moving notes across visibility boundaries.
  • Publishing any private/project/session content.

Forbidden without explicit instruction:

  • Reading secrets or private folders.
  • Writing API keys, tokens, passwords, or credentials into notes.
  • Replacing raw source content with summaries.
  • Re-extracting recalled memory as if it were new user input.
  • Publishing 99-private/, .obsidian/workspace*.json, .hermes-kb/, session transcripts, or secrets-adjacent notes.

.agentignore / publication exclusion baseline

text
99-private/**
50-sessions/**
20-projects/**/secrets/**
**/.obsidian/workspace*.json
**/.trash/**
**/*secret*
**/*password*
**/*token*
.hermes-kb/**
node_modules/**
.vitepress/cache/**
.vitepress/dist/**
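
The baseline above could be enforced with a small path check like this sketch. It uses `fnmatch`, whose `*` also matches `/`, so the behavior only approximates gitignore-style globbing; the special handling that lets a leading `**/` match zero directories is an assumption about the intended semantics.

```python
from fnmatch import fnmatchcase

EXCLUDE_PATTERNS = [
    "99-private/**", "50-sessions/**", "20-projects/**/secrets/**",
    "**/.obsidian/workspace*.json", "**/.trash/**",
    "**/*secret*", "**/*password*", "**/*token*",
    ".hermes-kb/**", "node_modules/**", ".vitepress/cache/**", ".vitepress/dist/**",
]


def is_excluded(path: str, patterns=EXCLUDE_PATTERNS) -> bool:
    """Return True if a vault-relative POSIX path matches any exclusion pattern."""
    for pattern in patterns:
        candidates = [pattern]
        if pattern.startswith("**/"):
            # Assumed semantics: "**/" may also match zero leading directories.
            candidates.append(pattern[3:])
        if any(fnmatchcase(path, candidate) for candidate in candidates):
            return True
    return False
```

Broad patterns such as `**/*token*` will also exclude harmless notes whose names merely contain the word, which matches the default-deny intent but may need an allowlist later.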

Human confirmation triggers

Hermes must ask for confirmation or produce a proposal-only patch when:

  • The edit changes a personal preference, identity fact, relationship, medical/financial/legal fact, or other sensitive personal data.
  • The edit changes the system's conclusion about a contested or high-impact claim.
  • The edit deletes, archives, or supersedes a note.
  • The edit changes raw source content rather than metadata.
  • The edit makes private/scoped content public.
  • The task scope does not match the note's scope field.

MVP Operating Loop

The MVP should prove that an agent can enter the vault, find truth, update knowledge, and leave an auditable trail.

MVP scope

Use /Users/a17/wiki as the first standalone Obsidian/VitePress research vault. Do not migrate the full personal vault yet.

MVP includes:

  • Raw source capture under raw/.
  • Concept synthesis under concepts/.
  • index.md and log.md maintenance.
  • Git commits for every completed knowledge change.
  • VitePress publication only for public/research-safe notes.
  • Manual review for private, personal, contested, or destructive edits.

MVP excludes:

  • Vector DB.
  • Graph DB.
  • Automatic personal memory extraction.
  • Automatic publication of project/session/private notes.
  • Autonomous deletion.
  • Autonomous archival without review of archive_candidate proposals.
  • Obsidian REST/MCP dependency unless filesystem-first editing fails.

MVP ingest loop

  1. User provides a source URL/file/path or places a clip in inbox.
  2. Hermes creates a raw source note or artifact pointer with type: raw_source, source_id, storage path, hash, capture date, preservation status, and extraction status.
  3. Hermes verifies that the raw source is readable from the recorded path.
  4. Hermes extracts candidate claims, entities, and open questions into an analysis section or separate draft.
  5. Hermes searches existing concepts before creating new pages.
  6. Hermes patches the most relevant concept/query/decision note with source-backed claims.
  7. Hermes updates index.md if a new durable page was created.
  8. Hermes appends log.md with what changed and why.
  9. Hermes runs lint/build checks.
  10. Hermes commits and pushes.
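Step 2 of the loop can be sketched as a function that renders the raw source frontmatter. The field list follows the step, but the exact key names, the `sha256:` prefix, and the status values `preserved` and `pending` are assumptions to be aligned with SCHEMA.md.

```python
import hashlib
from datetime import date


def raw_source_frontmatter(source_id: str, storage_path: str, content: bytes, captured: date) -> str:
    """Render YAML frontmatter for a raw source note (key names are illustrative)."""
    digest = hashlib.sha256(content).hexdigest()
    fields = {
        "type": "raw_source",
        "source_id": source_id,
        "storage_path": storage_path,
        "hash": f"sha256:{digest}",
        "captured": captured.isoformat(),
        "preservation_status": "preserved",  # assumed value
        "extraction_status": "pending",      # assumed value
    }
    # Values are written unquoted, so callers must avoid YAML-special characters.
    body = "\n".join(f"{key}: {value}" for key, value in fields.items())
    return f"---\n{body}\n---\n"


# Hypothetical identifiers, for illustration only.
print(raw_source_frontmatter("src-001", "raw/articles/example.md", b"hello", date(2026, 5, 14)))
```

Recording the hash at capture time lets step 3 and later lint runs confirm that the preserved artifact has not changed.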

MVP truth lookup loop

  1. Parse the user question into topic/project/scope.
  2. Search index.md, filenames, headings, and tags.
  3. Read candidate concept notes.
  4. Follow cited raw source paths for important factual claims.
  5. Answer with citations and confidence.
  6. If evidence is missing, say insufficient evidence and optionally create a query note.

MVP session-to-knowledge loop

  1. First apply the memory extraction threshold. Do not generate a durable session note merely because a session occurred.
  2. If the threshold is crossed, create an episodic session summary under 50-sessions/active/ with adaptive retention metadata and the fixed body sections.
  3. Extract state transitions: durable outcomes, decisions, procedures validated, sources added, open questions, and canonicalized items.
  4. Apply negative memory filtering with entropy default-reject: do not store shell/tool logs, chain-of-thought, repeated retrieval excerpts, conversational scaffolding, completed task traces, retry commands, or untracked future-maybe state.
  5. Do not include transcript, chronological replay, tool log, shell output spam, or chain-of-thought sections.
  6. Canonicalize durable items quickly: decisions -> decisions/, workflows -> procedures/, syntheses -> concepts/, stable preferences -> user-profile.md, validated operational rules -> project memory.
  7. After canonicalization, update promoted_items, set or move toward canonicalized: true, and lower session retrieval priority through the half-life mechanism.
  8. During maintenance, decay retrieval_weight after each half-life; mark archive_candidate: true only when canonicalized, old, unused, and uncited; draft compression proposals instead of deleting.
  9. Promote repeated procedures to Hermes skills when verified.
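The decay and archive rules in step 8 might look like the sketch below. Halving the weight once per completed half-life is one reading of "decay retrieval_weight after each half-life"; the parameter names and the 180-day age cutoff are assumptions.

```python
from datetime import date


def decayed_weight(base_weight: float, canonicalized_on: date, today: date, half_life_days: int) -> float:
    """Halve the weight once for every completed half-life since canonicalization."""
    elapsed_days = max((today - canonicalized_on).days, 0)
    return base_weight * 0.5 ** (elapsed_days // half_life_days)


def is_archive_candidate(canonicalized: bool, age_days: int, uses: int, citations: int,
                         min_age_days: int = 180) -> bool:
    """All four conditions must hold: canonicalized, old, unused, and uncited."""
    return canonicalized and age_days >= min_age_days and uses == 0 and citations == 0
```

Marking a note as an archive candidate only produces a proposal; deletion and archival still go through review.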

MVP done criteria

  • One new source can be ingested end-to-end with raw source preservation and a source-backed concept update.
  • A fresh Hermes session can answer a question by reading the vault and following raw source links.
  • Lint catches missing raw source metadata, missing frontmatter, and missing source links.
  • VitePress build succeeds after the update.
  • Git history clearly shows what changed.

Implementation Plan

Phase 0: Decide vault topology

Objective: Choose whether /Users/a17/wiki remains a standalone Obsidian vault or becomes a subfolder of a larger personal vault.

Files:

  • Review: /Users/a17/wiki/SCHEMA.md
  • Create later if standalone: /Users/a17/wiki/.obsidian/ through Obsidian UI, not Hermes

Recommendation: Use /Users/a17/wiki as a standalone Obsidian vault for the agent-memory research topic. Later create a separate private personal vault and link/mirror selected research pages.

Verification: Open /Users/a17/wiki in Obsidian as a vault and confirm links render.

Phase 1: Add agent operating rules

Objective: Make the vault self-describing for Hermes and future agents.

Files:

  • Create: AGENTS.md
  • Create: memory-policy.md or 00-system/memory-policy.md if adopting larger layout

AGENTS.md draft:

markdown
# Agent Rules for this Obsidian Wiki

- Preserve raw source text before synthesis.
- Search existing pages before creating new pages.
- Update index.md and log.md for every knowledge change.
- Cite raw sources for non-trivial claims.
- Do not edit raw sources except to fix preservation/extraction metadata.
- Use git status before and after edits.
- Commit and push completed changes when remote is available.
- Do not read or modify private folders unless explicitly instructed.

Verification: Ask Hermes to explain the vault rules in a new session; it should load AGENTS.md automatically when workdir is the repo.

Phase 2: Add Obsidian-facing dashboards

Objective: Make review queues visible to the human in Obsidian.

Files:

  • Create: dashboards/research-review.md
  • Create optional: dashboards/sources-needing-review.md

Dataview example:

markdown
# Research Review

```dataview
TABLE type, confidence, contested, updated
FROM "concepts"
WHERE contested = true OR confidence = "low"
SORT updated DESC
```

Bases alternative: create a Base filtered by type, confidence, contested, updated, and extraction_status.

Verification: Open dashboard in Obsidian and confirm table/base lists notes.

Phase 3: Add source ingestion command convention

Objective: Define a repeatable Hermes prompt/procedure for ingestion.

Files:

  • Create: 40-procedures/ingest-source.md or procedures/ingest-source.md

Procedure:

markdown
# Ingest Source Procedure

Input: URL/PDF/text and target topic.
1. Capture raw source markdown under raw/<type>/.
2. Add source metadata and hash when possible.
3. Preserve parsed source text.
4. Extract claims/entities with confidence.
5. Search existing concept pages.
6. Patch the most relevant page or create a new one only if threshold is met.
7. Update index.md and log.md.
8. Commit with docs: ingest <source/topic>.

Verification: Run the procedure on one new source and inspect git diff.

Phase 4: Add lint script

Objective: Catch structural drift before the vault becomes unreliable.

Files:

  • Create: scripts/wiki_lint.py
  • Modify: .github/workflows/wiki-maintenance.yml to call the script instead of inline Python

Checks:

  • frontmatter required keys
  • raw files include Parsed Source Text
  • broken markdown links
  • duplicated source_url
  • missing log update for changed concept/raw files
  • private folder excluded from docsify/publication

Verification: Run python3 scripts/wiki_lint.py; expected PASS.
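A minimal sketch of two of the checks (required frontmatter keys and the Parsed Source Text section), written as a pure function so it can be tested without touching the vault. The required key set is an assumption; the real list would come from SCHEMA.md, and a full script would walk the vault and exit non-zero on any problem.

```python
import re

REQUIRED_KEYS = {"type", "updated"}  # assumed minimum set


def lint_note(path: str, text: str) -> list:
    """Return a list of human-readable problems found in one note."""
    match = re.match(r"^---\n(.*?)\n---\n", text, re.DOTALL)
    if not match:
        return [f"{path}: missing frontmatter"]
    # Collect top-level keys without a YAML dependency; nested values are ignored.
    keys = {
        line.split(":", 1)[0].strip()
        for line in match.group(1).splitlines()
        if ":" in line and not line.startswith((" ", "-"))
    }
    problems = [
        f"{path}: missing frontmatter key '{key}'" for key in sorted(REQUIRED_KEYS - keys)
    ]
    if path.startswith("raw/") and "## Parsed Source Text" not in text:
        problems.append(f"{path}: raw note lacks '## Parsed Source Text'")
    return problems
```

Keeping each check a function of path and text makes it straightforward to add the remaining checks (broken links, duplicated source_url) one at a time.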

Phase 5: Add session-to-note workflow

Objective: Preserve useful Hermes sessions without polluting durable memory.

Files:

  • Create: procedures/session-to-note.md
  • Create folder: sessions/ or 50-sessions/

Policy:

  • Store concise session summary, not raw transcript by default.
  • Extract durable decisions/open questions/procedures.
  • Do not store temporary command outputs unless needed for reproducibility.
  • Promote reusable workflows to Hermes skills, not just Obsidian notes.

Verification: Convert one prior wiki session into a note and ensure no secrets/transient logs are included.

Phase 6: Optional MCP/REST integration

Objective: Integrate Hermes with Obsidian's active file and plugin APIs only after filesystem-first workflows are stable.

Steps:

  1. Install Obsidian Local REST API plugin.
  2. Keep API local and store token in Hermes .env, not in notes.
  3. Configure Hermes MCP if using an Obsidian MCP server:
bash
hermes mcp add obsidian --command "npx -y obsidian-mcp-server"
hermes mcp test obsidian
hermes mcp configure obsidian

Exact command depends on selected MCP package; verify current package docs before running.

Verification: Use MCP/REST to read the active note and patch a test heading in a scratch note.

Phase 7: Optional retrieval index

Objective: Improve recall once notes exceed what index.md + search_files handles well.

Files:

  • Create: scripts/build_index.py
  • Create derived: .hermes-kb/index.sqlite
  • Add .hermes-kb/ to .gitignore unless intentionally sharing index

Verification: Run benchmark queries and compare recall before/after.

Acceptance Criteria

A working Hermes + Obsidian personal knowledge base should satisfy:

  • Obsidian can browse/edit all notes normally.
  • Hermes can ingest a new source with raw preservation, citation, index/log update, and git commit.
  • Hermes can answer a research question with cited notes.
  • Hermes can distinguish user profile, project memory, raw sources, and session notes.
  • A dashboard shows notes needing review.
  • A lint command catches missing frontmatter, missing Parsed Source Text, and broken links.
  • Sensitive/private folders are excluded from automation and publication by default.
  • The system has a documented retrieval/evaluation plan before adding embeddings.

Open Questions

  1. Should the user's main personal Obsidian vault be separate from public/publishable research vaults?
  2. Should Hermes write directly to Obsidian via filesystem only, or should it use Local REST API for active-file awareness?
  3. Which notes should be eligible for GitHub Pages / Docsify publication?
  4. What is the minimum review UI: Obsidian Dataview/Bases, GitHub PRs, or both?
  5. How should Hermes session summaries be exported: manual /save, session_search summaries, or scheduled cron jobs?
  6. Should a future Hermes memory provider use Obsidian as a backend, or should Obsidian remain a separate canonical KB with retrieval tools?

Source Map

| Claim | Source | Type | Reliability | Notes |
| --- | --- | --- | --- | --- |
| Durable agent memory should be inspectable, editable, integrated, and operational | concepts/llm-wiki-agent-memory-research-framework.md | synthesis | medium | Existing wiki synthesis |
| Karpathy pattern separates raw sources, wiki, and schema with ingest/query/lint operations | raw/articles/karpathy-llm-wiki-gist-2026.md | primary/source | high | Conceptual seed |
| Markdown+git is an emerging canonical memory pattern | raw/github/wuphf-repo-readme.md; raw/github/llm-wiki-compiler-repo-readme.md | github/readme | medium-high | Implementation evidence, not universal benchmark |
| Context engineering maps to write/select/compress/isolate | raw/github/langchain-context-engineering-repo-readme.md; raw/github/langchain-how-to-fix-your-context-readme.md | github/readme | medium-high | Practical implementation references |
| Indiscriminate memory storage can create junk and feedback loops | raw/github/mem0-issue-4573-memory-audit-junk.md | github issue | medium | Single detailed production case study |
| Agent-global memory needs conversation/project scoping | raw/github/letta-issue-652-per-conversation-context-scoping.md | github issue | medium-high | Concrete design issue |
| Hermes built-in memory is bounded and best for compact durable steering | Hermes memory docs fetched 2026-05-14 | product docs | high | MEMORY.md/USER.md small prompt-injected stores |
| Obsidian Web Clipper saves web content locally to markdown files | Obsidian Web Clipper docs/README fetched 2026-05-14 | product docs/github | high | Useful capture layer |
| Obsidian Properties and Bases support structured metadata/database-like views over markdown | Obsidian Help fetched 2026-05-14 | product docs | high | Useful review dashboards |
| Dataview indexes markdown metadata and queries notes | Dataview docs fetched 2026-05-14 | plugin docs | medium-high | Community plugin, mature but not core |
| Obsidian Local REST API can expose read/write/search/patch/active-file operations and MCP | Local REST API README fetched 2026-05-14 | plugin docs/github | medium-high | Optional integration |

Current Corrections / Evidence Gaps

  • Obsidian official help is delivered through Obsidian Publish and was fetched via preloaded markdown URLs. Content should be rechecked if implementing exact plugin settings.
  • The plan has not yet inspected the user's actual Obsidian vault path or installed plugins.
  • MCP package commands vary by selected Obsidian MCP server; verify package docs before configuring Hermes MCP.
  • No retrieval benchmark has been run on the user's real notes yet.