LLM Wiki Schema
This repository follows the LLM Wiki pattern: raw sources are immutable, the wiki is the maintained synthesis layer, and this file is the schema that tells the model how to behave as a disciplined maintainer rather than a generic chatbot.
This repository intentionally keeps machine-oriented configuration out of the vault itself. Global defaults, including the vault root, belong in ~/.llm-wiki/config.json; this file is for wiki maintenance behavior.
Architecture
There are three layers:
- raw/: source documents whose substance should remain stable even if paths are reorganized
- wiki/: LLM-maintained markdown pages
- AGENTS.md: the schema that defines structure and workflow
The vault root itself should be maintained as a git repository.
Two top-level files support the wiki:
- index.md: the catalog of durable pages
- log.md: the chronological record of ingests, queries, and lint passes
Core Idea
Do not treat this repository like plain RAG over a pile of files.
The goal is not to rediscover the same knowledge from raw sources every time a question is asked. The goal is to incrementally build and maintain a persistent wiki that compounds over time.
When a new source arrives, integrate it into the existing wiki. When a useful answer is produced, preserve it if it belongs in the knowledge base. When the wiki drifts, repair it.
Operating Rules
- Do not rewrite the substantive content of files inside raw/.
- Raw taxonomy refactors may update only path-dependent metadata or local links needed to keep moved raw files navigable.
- Read index.md before broad ingest, query, or lint work.
- Prefer updating an existing page over creating a near-duplicate.
- Keep uncertainty, disagreement, and superseded claims visible.
- Use relative markdown links.
- When a local markdown link target contains spaces, wrap the destination in angle brackets so the link remains clickable.
- Do not use absolute local filesystem paths in repository markdown; keep in-repo links relative.
- Keep repository markdown renderable in the local VitePress site; if a syntax pattern breaks rendering or build output, repair it in the same pass.
- In markdown prose, do not leave raw angle-bracket placeholders such as <name here> exposed unless they are intentional HTML; wrap them in code spans or escape them as entities.
- Do not keep transient browser-only embeds such as blob: URLs in repository markdown; replace them with stable links, screenshots, or other render-safe representations that preserve the source meaning.
- Update index.md whenever you add a durable page or materially change the catalog.
- Preserve older log history, but insert new entries at the top of the entry list.
- Treat index.md and log.md as part of the working memory of the vault.
- If the vault root is not yet a git repository, initialize it.
- After any material vault change, make an intentional git commit unless explicitly told not to.
- As the corpus grows, reorganize wiki subdirectories when needed instead of letting one directory become a flat dump.
- As the raw corpus grows, reorganize raw/ subdirectories when the folder stops being navigable.
- When moving markdown pages to refine taxonomy, repair all impacted links in the same pass.
- When moving raw files to refine taxonomy, repair all impacted links and raw-local navigation metadata in the same pass.
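The linking rules above (relative links only, angle brackets around targets that contain spaces) can be sketched as a small helper. This is only an illustration: the function name `relative_link` and the example paths are hypothetical and not part of this repository's scripts.

```python
import os


def relative_link(label: str, from_page: str, to_page: str) -> str:
    """Build a relative markdown link between two files in the vault.

    Both paths are given relative to the vault root (an assumption of this sketch).
    """
    start = os.path.dirname(from_page) or "."
    target = os.path.relpath(to_page, start=start)
    target = target.replace(os.sep, "/")  # markdown links use forward slashes
    if " " in target:
        # Angle brackets keep a destination with spaces clickable.
        target = f"<{target}>"
    return f"[{label}]({target})"


print(relative_link("Notes", "wiki/topics/example.md", "wiki/sources/my notes.md"))
print(relative_link("Index", "wiki/topics/example.md", "index.md"))
```

Because the link is computed from the linking page's own directory, the same helper can be rerun after a taxonomy move to regenerate the affected links.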
Language Conventions
This wiki is bilingual in English and Chinese.
- English pages use the canonical unsuffixed path, for example wiki/topics/example.md.
- Chinese pages use the sibling .zh.md path, for example wiki/topics/example.zh.md.
- Markdown raw source translations use the same convention, for example raw/source.md and raw/source.zh.md.
- raw/*.zh.md must be faithful translations of the source, not summary rewrites, abridgements, or editorialized notes.
- Visual raw assets such as screenshots can be shared across languages without duplicate .zh image siblings.
- Human-facing repo docs follow the same convention, for example README.md and README.zh.md, docs/usage.md and docs/usage.zh.md.
- The root catalog uses both index.md and index.zh.md.
- Every durable wiki page, including wiki/sources/ pages, should have both language variants.
- Put a language switch link directly under the title on both variants.
- Markdown raw source files should also have a language switch link near the top of the document.
- When a translated raw file exists, source pages should expose both the original raw file and the translated raw file.
- If direct markdown links to raw files prove brittle in the local VitePress site for a given source family, prefer stable code-style raw path display over repeatedly reintroducing fragile jump links.
- When possible, Related Pages should link to pages in the same language.
- When a page changes materially, update both language variants in the same maintenance pass.
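The sibling-path convention above is mechanical enough to express in code. The following is a minimal sketch, assuming paths are written with forward slashes relative to the vault root; the helper names `zh_sibling`, `en_sibling`, and `missing_siblings` are hypothetical.

```python
from pathlib import PurePosixPath

ZH_SUFFIX = ".zh.md"


def zh_sibling(path: str) -> str:
    """Return the Chinese sibling of an English markdown path."""
    p = PurePosixPath(path)
    if p.name.endswith(ZH_SUFFIX):
        return str(p)  # already the Chinese variant
    if p.suffix != ".md":
        raise ValueError(f"not a markdown page: {path}")
    return str(p.with_name(p.stem + ZH_SUFFIX))


def en_sibling(path: str) -> str:
    """Return the canonical unsuffixed English path for a .zh.md path."""
    p = PurePosixPath(path)
    if p.name.endswith(ZH_SUFFIX):
        return str(p.with_name(p.name[: -len(ZH_SUFFIX)] + ".md"))
    return str(p)


def missing_siblings(paths):
    """List the sibling paths that should exist but are absent from `paths`."""
    present = set(paths)
    missing = set()
    for path in present:
        if not path.endswith(".md"):
            continue  # images and other assets are shared across languages
        other = en_sibling(path) if path.endswith(ZH_SUFFIX) else zh_sibling(path)
        if other not in present:
            missing.add(other)
    return sorted(missing)


print(zh_sibling("wiki/topics/example.md"))
print(missing_siblings(["README.md", "README.zh.md", "docs/usage.md"]))
```

A lint pass could feed the list of durable pages into `missing_siblings` to find pages that still lack a language variant.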
Translation Freshness
- Treat translations as maintained siblings, not one-time exports.
- Treat raw translations as translation artifacts, not synthesis surfaces; summaries, interpretations, and reorganized takeaways belong in wiki/, not in raw/*.zh.md.
- If an English wiki page changes materially, update the .zh.md sibling in the same pass when feasible.
- If an English raw source changes or is replaced, update its raw/*.zh.md sibling in the same pass when feasible.
- If a screenshot or other visual raw asset changes materially, update the bilingual source pages that interpret it in the same pass when feasible.
- If a translation cannot be updated immediately, mark the translated file with translation_status: stale in frontmatter.
- Remove translation_status: stale once the translated sibling is current again.
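Marking and clearing the stale flag can be sketched with plain string handling. This sketch assumes the frontmatter is a YAML block delimited by `---` lines at the very top of the file; the function names are hypothetical, and only the `translation_status: stale` key comes from the rules above.

```python
STALE_LINE = "translation_status: stale"


def _frontmatter_end(lines):
    """Return the index of the closing '---' line, or None without frontmatter."""
    if not lines or lines[0].strip() != "---":
        return None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return i
    return None


def mark_stale(text: str) -> str:
    """Add the stale flag to the frontmatter, creating the block if needed."""
    lines = text.split("\n")
    end = _frontmatter_end(lines)
    if end is None:
        return "\n".join(["---", STALE_LINE, "---", ""]) + text
    if any(line.strip() == STALE_LINE for line in lines[1:end]):
        return text  # already marked, keep the call idempotent
    return "\n".join(lines[:end] + [STALE_LINE] + lines[end:])


def clear_stale(text: str) -> str:
    """Remove the stale flag; other frontmatter keys are left untouched."""
    lines = text.split("\n")
    end = _frontmatter_end(lines)
    if end is None:
        return text
    kept = [line for line in lines[1:end] if line.strip() != STALE_LINE]
    return "\n".join(["---"] + kept + lines[end:])


page = "---\ntitle: Example\n---\n# Example\n"
print(mark_stale(page))
```

Note that `clear_stale` leaves an empty frontmatter block behind if the flag was the only key; whether to strip that block as well is a judgment call this sketch does not make.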
Operations
Ingest
When processing a source from raw/:
- read the source directly
- if the source is a screenshot or other image file, run .codex/skills/llm-wiki/scripts/extract_visual_sources.py first and use the OCR output as supporting evidence
- if a markdown raw source depends on local screenshots or diagrams, extract their OCR text when the visuals materially affect the synthesis
- for Chinese or mixed-language screenshots, prefer installed Chinese OCR languages such as chi_sim+eng or chi_tra+eng
- create or update its page in wiki/sources/
- revise any affected concept, entity, or topic pages
- note where new material strengthens, weakens, or contradicts existing claims
- update index.md
- insert an ingest entry near the top of log.md
- commit the resulting change when the vault was updated
Ingest is integration work, not mere summarization. A single source may require changes across multiple wiki pages.
Before creating a new long-lived page, check whether an existing page should absorb the material instead.
Query
When answering questions:
- start from index.md
- read only the pages needed to answer well
- answer from the wiki when the wiki already contains the synthesis
- revisit raw sources only when the question requires deeper verification or the wiki appears incomplete
- if the answer creates durable knowledge, file it back into wiki/answers/ or update the relevant long-lived page
- insert a query entry near the top of log.md when the vault changes materially
- commit the resulting change when the vault was updated
Useful answers should compound into the wiki instead of vanishing into chat history.
Lint
Periodically maintain the wiki:
- look for contradictions between pages
- look for stale claims that newer sources have superseded
- look for orphan pages and missing cross-links
- look for important concepts mentioned repeatedly but lacking their own page
- look for gaps that suggest the next source to ingest or next question to ask
- look for markdown rendering hazards, especially malformed links, unmatched HTML tags, transient blob: embeds, broken fences, or escaped backticks that expose raw angle-bracket placeholders
- look for obvious taxonomy drift in raw/ and wiki/, especially flat top-level files that already belong to a clear source family or navigation cluster
- treat routine cleanup requests as including a taxonomy check by default, not only hygiene work like ignores or build leftovers
- when taxonomy drift is obvious, prefer landing one or two high-confidence regroupings before falling back to hygiene-only cleanup
- repair render-breaking markdown with the smallest syntax-only edit that preserves source meaning, especially in raw/ where substance should remain stable
- insert a lint entry near the top of log.md when maintenance work changes the vault materially
- commit the resulting change when the vault was updated
Lint is an editorial pass over the whole knowledge base, not just a structural check.
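One of the structural checks in the list above, finding orphan pages, can be sketched as follows. This is a simplified illustration rather than the repository's lint tooling: it assumes pages are supplied as a mapping from vault-relative path to markdown text, it treats index.md and index.zh.md as roots that need no inbound link, and its regex ignores links that carry a title.

```python
import posixpath
import re

# Matches [label](target) and [label](<target with spaces>); titled links are skipped.
LINK_RE = re.compile(r"\[[^\]]*\]\((<[^>]+>|[^)\s]+)\)")


def link_targets(page_path, text):
    """Resolve the relative link targets of one page to vault-relative paths."""
    targets = set()
    base = posixpath.dirname(page_path)
    for raw in LINK_RE.findall(text):
        target = raw.strip("<>").split("#", 1)[0]  # drop brackets and anchors
        if not target or "://" in target or target.startswith("mailto:"):
            continue  # external link or pure in-page anchor
        targets.add(posixpath.normpath(posixpath.join(base, target)))
    return targets


def find_orphans(pages, roots=("index.md", "index.zh.md")):
    """Return pages that no other page links to, excluding the root catalogs."""
    linked = set()
    for path, text in pages.items():
        linked |= link_targets(path, text) - {path}  # self-links do not count
    return sorted(p for p in pages if p not in linked and p not in roots)


pages = {
    "index.md": "- [Example](wiki/topics/example.md)",
    "wiki/topics/example.md": "See [notes](<../sources/my notes.md>).",
    "wiki/sources/my notes.md": "Back to [example](../topics/example.md#summary).",
    "wiki/concepts/lonely.md": "No one links here.",
}
print(find_orphans(pages))
```

A structural check like this only surfaces candidates; deciding whether an orphan should be linked, merged, or removed remains editorial work.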
Taxonomy Refactors
When directory structure stops fitting the corpus:
- split broad folders into narrower subdirectories
- trigger when a stable family, topic slice, or navigation pattern is already obvious enough to deserve its own subtree
- treat navigability and family clarity as the primary trigger, not raw file counts
- prefer moving pages over cloning them
- rewrite all affected links in the same pass
- update index.md, index.zh.md, and any overview pages that expose the moved pages
Raw Taxonomy Refactors
When raw/ stops fitting the corpus:
- split broad folders into narrower subdirectories
- trigger as soon as a stable source family is clear enough that grouping it would make navigation better
- treat “these files obviously belong together” as a stronger signal than any directory size number
- during routine cleanup, inspect for one or two obvious raw-family regroupings before defaulting to superficial hygiene work
- prefer narrow, high-confidence regroupings over broad uniform renesting of every file in sight
- preserve source substance; only repair links or path metadata needed because files moved
- move original raw files and their .zh.md translation siblings together when both exist
- rewrite impacted links in wiki/, repo docs, indexes, and affected raw markdown files in the same pass
- prefer using .codex/skills/llm-wiki/scripts/move_raw_sources.py for safe moves with link repair
Index And Log Conventions
index.md should stay content-oriented:
- group pages by category
- keep each entry to a link plus a one-line description
- make it easy to decide which pages to read next
log.md should stay reverse-chronological, with the newest entries first.
log.md is an operations record for vault work, not a repository changelog.
- Record only ingest, query, and lint passes that materially changed the vault.
- Do not use log.md for code-only, script-only, config-only, or release-note style entries when the durable wiki state did not change.
- Scheduled maintenance should update log.md only when it actually lands a vault change; a no-op patrol should leave log.md untouched.
Preferred heading shape:
## [YYYY-MM-DD HH:MM] ingest | Source Title
## [YYYY-MM-DD HH:MM] query | Topic
## [YYYY-MM-DD HH:MM] lint | Maintenance summary
New log entries should be precise to the minute.
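The heading shape and the newest-first ordering can be sketched as two small functions. This is an illustration under stated assumptions: entries are recognized by a line starting with `## [`, any text above the first entry (such as a title) is kept in place, and the names `log_heading` and `insert_entry` are hypothetical.

```python
from datetime import datetime

KINDS = ("ingest", "query", "lint")


def log_heading(kind: str, title: str, when: datetime) -> str:
    """Format a log heading that is precise to the minute."""
    if kind not in KINDS:
        raise ValueError(f"unknown log entry kind: {kind}")
    return f"## [{when:%Y-%m-%d %H:%M}] {kind} | {title}"


def insert_entry(log_text: str, heading: str, body: str = "") -> str:
    """Insert a new entry above the first existing entry, preserving history."""
    entry = heading + "\n"
    if body:
        entry += "\n" + body.rstrip("\n") + "\n"
    entry += "\n"
    lines = log_text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("## ["):
            return "".join(lines[:i]) + entry + "".join(lines[i:])
    # No entries yet: place the first entry after whatever preamble exists.
    prefix = log_text.rstrip("\n")
    return (prefix + "\n\n" if prefix else "") + entry


old_log = "# Log\n\n## [2025-01-01 09:00] lint | Old pass\n\nDetails.\n"
heading = log_heading("ingest", "Source Title", datetime(2025, 1, 2, 3, 4))
print(insert_entry(old_log, heading))
```

Older entries are never rewritten; the function only splices the new entry in front of them.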
Page Roles
- wiki/sources/: source-oriented notes tied to raw files
- wiki/concepts/: reusable ideas, frameworks, and abstractions
- wiki/entities/: people, organizations, tools, products, and projects
- wiki/topics/: broader synthesis spanning many pages
- wiki/answers/: durable outputs created during query work
Style
- Start pages with a short summary.
- Prefer compact sections over long prose.
- Use explicit sections when they help with maintenance: Summary, Key Claims, Visual Notes, OCR Excerpts, Related Pages, Sources, Open Questions.
- Keep pages interlinked.
- Let pages accumulate synthesis over time; do not rewrite them as isolated snapshots.
- Keep English and Chinese sibling pages structurally aligned enough that switching languages stays fast and predictable.
Git History
- Keep commits small and intentional.
- Commit because the knowledge base changed in a meaningful way, not merely because a file timestamp moved.
- If a commit convention applies in the surrounding environment, follow it.
Git Remotes
This repository uses one remote:
- origin: GitHub, the source of truth
Rules:
- Before starting material work, check whether origin has newer commits and sync local master from origin/master if needed.
- Treat origin as authoritative when the remotes diverge unless the user explicitly says otherwise.
- Before pushing changes that affect the VitePress site, run a local pnpm run docs:build and make sure it succeeds.
- Do not push a VitePress-affecting change to origin if the local site build is failing.
- After any local commit, push to origin.
- Do not leave committed local changes unpublished on origin unless a push fails.
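The pre-work sync check can be sketched by comparing local master with origin/master. This is one possible approach, not a prescribed workflow: it relies on `git fetch origin` followed by `git rev-list --left-right --count master...origin/master`, which prints the number of commits only on the left side (local) and only on the right side (origin). The function names and the returned action labels are hypothetical.

```python
import subprocess


def parse_ahead_behind(output: str):
    """Parse the 'left<TAB>right' counts printed by git rev-list --left-right --count."""
    ahead, behind = (int(n) for n in output.split())
    return ahead, behind


def sync_action(ahead: int, behind: int) -> str:
    """Map commit counts to the next step suggested by the rules above."""
    if ahead == 0 and behind == 0:
        return "up-to-date"
    if ahead == 0:
        return "fast-forward"  # origin has newer commits: sync local master first
    if behind == 0:
        return "push"  # committed local changes are not yet on origin
    return "diverged"  # origin is authoritative unless the user says otherwise


def check_origin(repo: str = ".") -> str:
    """Fetch origin and report how local master relates to origin/master."""
    subprocess.run(["git", "-C", repo, "fetch", "origin"], check=True)
    result = subprocess.run(
        ["git", "-C", repo, "rev-list", "--left-right", "--count",
         "master...origin/master"],
        check=True, capture_output=True, text=True,
    )
    return sync_action(*parse_ahead_behind(result.stdout))


print(sync_action(*parse_ahead_behind("0\t3\n")))
```

The sketch only reports the state; it deliberately does not reset, merge, or push, since the diverged case may need an explicit decision.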
Browser Automation
- Never use Safari for browser automation or manual browser-directed tasks on this machine.
- Prefer Google Chrome for browser-driven work.
- On this machine, the installed Chrome app is /Applications/Google Chrome 2.app.
- If a tool cannot find the default Chrome path, explicitly fall back to /Applications/Google Chrome 2.app before trying any other browser.
- When a task says to use the already-open browser, prefer attaching to or scripting the existing Google Chrome session rather than opening Safari or switching browsers.
Git Identity
- This repository uses a repository-local GitHub-facing identity by default.
- The local git identity for this repository should stay csh2022 <1065521866@qq.com>.
- Keep this identity scoped to this repository only; do not change global git user.name or user.email.
- Use the repository-local identity for normal commits and pushes to origin, including commits that trigger Vercel deployments.
- If the repository-local git identity drifts away from csh2022 <1065521866@qq.com>, restore it before making or pushing new commits.