Building a Brand Mind: An LLM Wiki Architecture Inspired by Karpathy
We built a persistent LLM wiki for brands that compounds knowledge instead of rediscovering it. Here's how it works and why it beats RAG.
Dana Willow
Senior Marketer sharing 15 years of marketing wisdom through an AI lens.
Published on April 21, 2026
Updated on April 21, 2026

How we built an LLM wiki for brands using Karpathy's idea (we call it our brand mind)
I spent 3 months feeding customer calls, Slack threads, and product docs into ChatGPT's file upload. Every question meant watching the same documents get retrieved and re-processed. The LLM was thinking in circles.
The problem wasn't the tech. RAG works fine for one-off questions. But brands don't ask one-off questions. You're building a picture of your market, your users, your positioning over weeks and months. You need the knowledge to stick.
So we built something different. A persistent wiki that an LLM maintains, where your brand knowledge compounds instead of getting rediscovered every time.
Why RAG Doesn't Solve This
Most LLM document tools (NotebookLM, ChatGPT uploads, standard RAG systems) work the same way: you drop files in, ask a question, the system retrieves relevant chunks and generates an answer. Clean. Simple.
But there's no memory between queries. Ask how your brand voice differs across 5 customer segments and the LLM has to find and synthesize those fragments fresh. Every single time.
Nothing accumulates. The system treats your 50th question the same as your first. If you've already asked about competitor positioning 3 times, surfaced contradictions in your messaging docs, and refined your thesis, none of that context persists.
According to research from Stanford's HAI Institute, knowledge retrieval without synthesis costs organizations an average of 47 minutes per complex query when counting re-analysis time across teams. That's roughly 8 hours a week for a team asking 10 such questions.
The Core Architecture
We use 3 layers. Raw sources sit at the bottom (immutable truth). The wiki lives in the middle (LLM-maintained synthesis). A schema file at the top tells the LLM how to behave.
Raw sources are your untouched documents. Customer interview transcripts, competitor analyses, brand guidelines, product specs. These never change. The LLM reads from them but can't modify them.
The wiki is where synthesis happens. Entity pages for each competitor, concept pages for your positioning themes, comparison tables, timeline summaries. All markdown files that the LLM writes and maintains. When a new source arrives, the LLM updates 10-15 wiki pages to integrate that knowledge.
The schema (we call it BRAND_MIND.md) defines the structure. What page types exist, what conventions to follow, how to handle conflicts between sources, when to create new pages vs. update existing ones.
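Concretely, the whole system can live in one folder. A sketch of the layout (the individual page and source names here are illustrative; the article only specifies BRAND_MIND.md, index.md, log.md, the raw sources, and the three page types):

```
brand-mind/
├── BRAND_MIND.md        # schema: page types, conventions, conflict rules
├── index.md             # one-line description of every wiki page
├── log.md               # append-only record of ingests, queries, lints
├── raw/                 # immutable sources: transcripts, specs, analyses
│   └── 2026-04-02-acme-demo-call.md
└── wiki/
    ├── entities/competitor-acme.md
    ├── concepts/authenticity-vs-automation.md
    └── comparisons/voice-enterprise-vs-smb.md
```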
This isn't a chatbot interface with memory. It's a codebase where the LLM is the programmer and your brand knowledge is the code.
Three Operations That Matter
Ingest is how sources enter the system. You drop a competitor's product page into raw sources and tell the LLM to process it. The flow: read the source, pull out key claims, update the competitor entity page, update relevant positioning concept pages, note contradictions with your existing analysis, log the ingest.
I prefer doing this one source at a time. The LLM shows me what it extracted and what pages it's updating. I can steer it (emphasize their pricing strategy, ignore the generic feature list). Takes 5 minutes per source but the quality is higher.
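The last step of that flow, logging the ingest, is plain bookkeeping that's easy to script. A minimal sketch, assuming a one-line-per-operation log format (the function name and entry layout are my own; the article only says log.md is append-only and timestamped):

```python
from datetime import datetime, timezone
from pathlib import Path


def log_ingest(repo: Path, source: str, pages_updated: list[str]) -> None:
    """Append one ingest record to the append-only log.md."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    pages = ", ".join(pages_updated)
    line = f"- {stamp} INGEST {source} -> updated: {pages}\n"
    with (repo / "log.md").open("a", encoding="utf-8") as f:
        f.write(line)
```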
Query Becomes Creation
You ask questions against the wiki and get synthesized answers with citations. But here's the twist: good answers get filed back into the wiki as new pages.
You ask "how does our brand voice differ between enterprise and SMB messaging?" and the LLM generates a comparison table with examples from 8 documents. That table becomes a permanent wiki page. Next time someone needs it (or a related question comes up), that synthesis already exists.
Your explorations compound in the knowledge base just like your source documents do.
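Filing an answer back into the wiki is two file writes: save the page, register it in the index. A sketch under the assumptions above (slug rules, the `comparisons/` destination, and the index entry format are illustrative choices, not the article's spec):

```python
from pathlib import Path


def file_synthesis(repo: Path, title: str, body: str, description: str) -> Path:
    """Save a query answer as a permanent wiki page and register it in index.md."""
    slug = title.lower().replace(" ", "-").replace(":", "")
    page = repo / "wiki" / "comparisons" / f"{slug}.md"
    page.parent.mkdir(parents=True, exist_ok=True)
    page.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
    # Register the new page so future queries can find it via the index.
    with (repo / "index.md").open("a", encoding="utf-8") as f:
        f.write(f"- [{title}]({page.relative_to(repo)}): {description}\n")
    return page
```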
Health Checks Matter
Every 2 weeks we run a lint operation. The LLM scans for contradictions between pages, stale claims that newer sources have invalidated, orphan pages with no inbound links, concepts mentioned but lacking dedicated pages.
It's boring maintenance work that humans hate doing. The LLM doesn't care. It'll happily check 200 cross-references and flag 6 that need attention.
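One of those lint checks, finding orphan pages with no inbound links, doesn't even need the LLM. A sketch assuming standard markdown links between pages (the article doesn't specify the link syntax):

```python
import re
from pathlib import Path

# Matches markdown links that point at other .md files, e.g. [B](b.md)
LINK = re.compile(r"\]\(([^)#]+\.md)\)")


def find_orphans(wiki: Path) -> list[Path]:
    """Return wiki pages that no other page links to."""
    pages = sorted(wiki.rglob("*.md"))
    linked = set()
    for page in pages:
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            # Links are resolved relative to the page that contains them.
            linked.add((page.parent / target).resolve())
    return [p for p in pages if p.resolve() not in linked]
```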
The Filing System
Two files keep everything navigable. The index.md lists every page with a one-line description and metadata. Organized by type (entities, concepts, comparisons). The LLM updates it on every ingest.
When answering queries, the LLM reads the index first to find relevant pages, then drills into them. Works well up to ~100 sources and hundreds of pages without needing vector search infrastructure.
The log.md is chronological. Append-only record of every ingest, query, and lint pass with timestamps. Parseable with basic grep commands. Gives you a timeline of how your brand knowledge evolved.
How We Use It at PostKing
Every customer call transcript goes into raw sources. The LLM extracts pain points, feature requests, objections, and files them into the appropriate wiki pages.
We have entity pages for each customer segment (indie founders, SaaS teams, agencies). Concept pages for positioning themes (authenticity vs. automation, brand voice consistency, distribution challenges). Comparison pages for how our messaging performs across channels.
When we're writing new landing page copy, we query the wiki: "what objections do agencies raise about automated content?" The LLM synthesizes 12 call transcripts and 4 support threads into a prioritized list with direct quotes.
That synthesis becomes a new wiki page titled "Agency Objections: Brand Authenticity." Next time we need it (writing an email sequence, updating FAQs, briefing the sales team), it's already compiled and cross-referenced.
Real Impact on Content Strategy
Before this system, creating a content brief meant searching Slack, digging through Google Docs, and hoping you remembered which customer said what. Took 90 minutes and you'd still miss context.
Now it's 8 minutes. Query the wiki for the topic, get a synthesis with citations, and start writing. The knowledge is already organized, cross-referenced, and current.
We shipped 23 blog posts in Q1 2026 vs. 11 in Q4 2025. Same team size. The bottleneck wasn't writing, it was gathering context. The wiki solved that.
Implementation Details
We use Claude Code as the LLM agent and Obsidian as the reader interface. The LLM makes edits in one window, I browse the results in Obsidian on the other screen. The graph view shows what's connected to what and surfaces orphan pages.
Everything's stored as markdown files in a git repo. Version history and branching come free. When a team member wants to see a different positioning angle, they branch the wiki and let the LLM synthesize that alternative view without touching the main branch.
The schema file evolves with your needs. Started at 200 lines defining basic structure. Now it's 600 lines with specific workflows for customer feedback, competitor updates, and content brief generation.
Why Maintenance Actually Happens
Humans abandon knowledge bases because the bookkeeping grows faster than the value. You add 10 sources and now you need to update 40 cross-references, reconcile 3 contradictions, and reorganize 2 category pages.
The LLM doesn't get bored. Doesn't forget to update a link. Can touch 15 files in one pass while maintaining consistency. The maintenance cost is effectively zero.
Vannevar Bush described this vision in 1945 with his Memex concept (a personal knowledge store with associative trails between documents). He couldn't solve who does the tedious filing work. Turns out LLMs are perfect for exactly that kind of structured grunt work.
Getting Started
Start small. Pick one domain (customer feedback, competitive intel, content research). Gather 10-15 sources. Write a basic schema defining 3-4 page types. Feed it to Claude Code or your preferred LLM agent and ingest one source.
Watch what the LLM creates. Adjust the schema based on what works. Add more sources. The structure will become clearer as you go.
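Scaffolding that starter structure takes a few lines. A sketch with an illustrative seed schema (the page types come from the article; the wording of the schema itself is mine):

```python
from pathlib import Path

SCHEMA = """# BRAND_MIND.md
## Page types
- entity: one page per competitor or customer segment
- concept: one page per positioning theme
- comparison: tables contrasting segments, channels, or competitors
## Conventions
- Cite the raw source file for every claim.
- Log every ingest, query, and lint pass in log.md.
"""


def scaffold(root: Path) -> None:
    """Create the starter layout: schema, index, log, raw/, wiki/."""
    for d in ("raw", "wiki/entities", "wiki/concepts", "wiki/comparisons"):
        (root / d).mkdir(parents=True, exist_ok=True)
    (root / "BRAND_MIND.md").write_text(SCHEMA, encoding="utf-8")
    (root / "index.md").write_text("# Index\n", encoding="utf-8")
    (root / "log.md").write_text("", encoding="utf-8")
```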
You're not building a perfect taxonomy upfront. You're creating a system that evolves with your understanding. The LLM handles the reorganization when your mental model shifts.
For brand knowledge building, start with customer interviews and competitor positioning docs. Those two categories give you enough material to see the compounding effect within a week.
The goal isn't to automate thinking. It's to automate the bookkeeping that makes accumulated thinking actually usable. You curate sources, ask questions, and decide what matters. The LLM files everything so it's findable later.
Your brand mind gets richer with every conversation you have, every source you read, and every question you ask. The knowledge sticks.
About Dana Willow
Senior Marketer sharing 15 years of marketing wisdom through an AI lens. Teaching founders to automate smarter.


