Usage
中文 | English
Repository Layout
Core structure:
llm-wiki/
├── AGENTS.md
├── index.md
├── index.zh.md
├── log.md
├── raw/
└── wiki/
Meaning:
- raw/: Raw source layer. It can contain markdown, screenshots, and local image assets. Source substance should stay stable, though path-level reorganization is allowed when links and path metadata are repaired in the same pass.
- wiki/: Model-maintained knowledge layer, including sources/, topics/, concepts/, and answers/.
- AGENTS.md: The schema that defines how the model maintains the knowledge base.
- index.md / index.zh.md: Bilingual catalog entry points.
- log.md: Maintenance history for ingest, query, and lint work.
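As a quick sanity check of the layout above, the following is a minimal sketch that reports which of the listed entries are missing from a checkout. The function name `missing_layout_entries` is hypothetical and is not part of the repository's scripts.

```python
from pathlib import Path

# Entries taken from the core structure listed above.
EXPECTED_FILES = ("AGENTS.md", "index.md", "index.zh.md", "log.md")
EXPECTED_DIRS = ("raw", "wiki")


def missing_layout_entries(repo_root):
    """Return the expected top-level entries that are absent under repo_root.

    Hypothetical helper for illustration; the repository may check this differently.
    """
    root = Path(repo_root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

Calling `missing_layout_entries(".")` from the repository root returns an empty list when every listed file and directory is present.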
Day-to-Day Use
Manual ingest
After placing a source in raw/, in a new Codex session:
$llm-wiki ingest raw/xxx.md
Read AGENTS.md and index.md first, then update wiki/sources/, the relevant concept/topic/entity pages, and finally refresh index.md, index.zh.md, and log.md.
If the source is a screenshot or image file, first extract OCR text:
python3 .codex/skills/llm-wiki/scripts/extract_visual_sources.py --repo . --path raw/screenshots/foo.png --lang chi_sim+eng
Then ask Codex to ingest the screenshot as a source and fold the OCR-backed evidence into wiki/sources/ and the affected long-lived pages.
Manual query
$llm-wiki answer this question: ...
Start from index.md / index.zh.md and read only the necessary pages. If the answer is durable, write it back into wiki/answers/ or merge it into an existing page, then update log.md.
Manual lint
$llm-wiki run a full lint pass on this vault
Check for contradictions, stale conclusions, orphan pages, missing cross-links, missing translations, and stale translations, and commit if the pass makes material changes.
Raw taxonomy refactor
When raw/ becomes too flat, move files or folders without changing source substance:
- Trigger when a stable source family is already clear enough that grouping it will make navigation better.
- Prefer regrouping by source family over waiting for a directory-count threshold.
- Avoid creating one-file directories unless they clarify a real long-lived family boundary.
python3 .codex/skills/llm-wiki/scripts/move_raw_sources.py --repo . --move raw/foo.md:raw/articles/foo.md
Then run lint to catch any remaining broken links immediately:
python3 .codex/skills/llm-wiki/scripts/lint_vault.py
Batch screenshot OCR
When the vault accumulates many screenshots, batch-extract OCR before an ingest or lint pass:
python3 .codex/skills/llm-wiki/scripts/extract_visual_sources.py --repo . --all-raw-images
Check available OCR languages first when screenshots are multilingual:
tesseract --list-langs
This repo now expects Chinese OCR support to be available locally. On macOS with Homebrew, install the language pack with brew install tesseract-lang, then prefer --lang chi_sim+eng for Simplified Chinese screenshots or --lang chi_tra+eng for Traditional Chinese screenshots. If chi_sim or chi_tra is installed, the helper now defaults to a compact Chinese-first OCR bundle even when --lang is omitted.
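To illustrate the language choice described above, here is a minimal sketch that parses captured `tesseract --list-langs` output and suggests a `--lang` value. Both function names are hypothetical, and the selection logic is an assumption made for illustration; it does not reproduce the helper's own default behaviour.

```python
def parse_tesseract_langs(list_langs_output):
    """Extract language codes from captured `tesseract --list-langs` output."""
    langs = set()
    for line in list_langs_output.splitlines():
        name = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not name or name.lower().startswith("list of available languages"):
            continue
        langs.add(name)
    return langs


def suggest_lang(available, script="simplified"):
    """Suggest a --lang value, or None if the needed packs are not installed.

    Assumption: this mirrors the README's recommendation only
    (chi_sim+eng for Simplified, chi_tra+eng for Traditional).
    """
    chinese = "chi_sim" if script == "simplified" else "chi_tra"
    if chinese in available and "eng" in available:
        return f"{chinese}+eng"
    return None
```

A return value of None suggests installing the missing language pack before running the OCR helper.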
Bilingual Rules
The whole llm-wiki is bilingual.
- English pages use canonical paths.
- Chinese pages use sibling .zh.md files.
- index.md and index.zh.md are the bilingual root catalogs.
- All durable pages should exist in both languages, including wiki/sources/ and wiki/answers/.
raw is bilingual too
Markdown raw files also use sibling translations:
- source: raw/foo.md
- Chinese translation: raw/foo.zh.md
The raw Chinese sibling must be a faithful translation of the original source. Do not compress it into a summary, rewrite it into cleaner notes, or mix translation with editorial takeaways. That synthesis belongs in wiki/.
Both should expose a quick language switch near the top:
[中文](<foo.zh.md>) | English
中文 | [English](<foo.md>)
source pages link both raw variants
Each wiki/sources/*.md page should link:
- the original raw file
- the translated raw file
Standalone screenshots and other visual assets do not need duplicate .zh image siblings. Their bilingual interpretation should live in the source pages instead.
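The sibling convention for markdown raw files can be checked mechanically. The sketch below lists markdown sources under a raw directory that have no .zh.md sibling; the function name `missing_zh_siblings` is hypothetical and is not one of the repository's scripts. Image files are ignored, in line with the rule that visual assets do not need .zh siblings.

```python
from pathlib import Path


def missing_zh_siblings(raw_dir):
    """Return markdown sources under raw_dir that lack a sibling .zh.md file.

    Hypothetical helper for illustration; only *.md files are considered.
    """
    missing = []
    for path in sorted(Path(raw_dir).rglob("*.md")):
        if path.name.endswith(".zh.md"):
            continue  # this file is itself a translation
        sibling = path.with_name(path.stem + ".zh.md")
        if not sibling.exists():
            missing.append(path)
    return missing
```

For example, `missing_zh_siblings("raw")` would include raw/foo.md if raw/foo.zh.md does not exist.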
Translation Freshness
Translations are maintained siblings, not one-time exports.
- Keep raw/*.zh.md loyal to the source text. If the current Chinese file behaves more like a summary than a translation, treat it as incorrect and bring it back in line with the original.
- When the source page changes materially, update its translation in the same pass when feasible.
- When a raw source changes materially, update raw/*.zh.md in the same pass when feasible.
- If a translation cannot be synced immediately, mark it in frontmatter:
translation_status: stale
- Remove the marker once the translation catches up.
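The stale marker lends itself to a simple scan. The sketch below is a minimal example that assumes the frontmatter is a leading block delimited by `---` lines and that the marker is written in the .zh.md file itself; both are assumptions, and the function names are hypothetical rather than part of the repository's lint script.

```python
from pathlib import Path


def frontmatter_lines(text):
    """Return the lines between a leading '---' and the next '---', else []."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return lines[1:i]
    return []  # unterminated frontmatter is treated as absent


def is_marked_stale(text):
    """True if the frontmatter contains translation_status: stale."""
    for line in frontmatter_lines(text):
        key, sep, value = line.partition(":")
        if sep and key.strip() == "translation_status":
            return value.strip().strip("'\"") == "stale"
    return False


def stale_translations(repo_root):
    """List .zh.md files under repo_root that carry the stale marker."""
    return [
        path
        for path in sorted(Path(repo_root).rglob("*.zh.md"))
        if is_marked_stale(path.read_text(encoding="utf-8"))
    ]
```

Running `stale_translations(".")` before a lint pass gives a worklist of translations that still need to catch up.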