Synopsis

Karpathy, a co-founder of OpenAI and a former director of AI at Elon Musk's Tesla, has proposed a strategy in which users track personal goals, health, or learning by having a large language model organise journals, articles, and notes into a structured, self-updating knowledge system.

Artificial intelligence (AI) researcher Andrej Karpathy has outlined a new way of using large language models (LLMs) to build personal knowledge bases, moving away from the widely used retrieval-based approach toward a persistent, self-improving system for AI-assisted search and question answering.

Karpathy was a founding member of OpenAI and later served as director of AI at Elon Musk's Tesla. He founded his own firm, Eureka Labs, in 2024.

The idea, shared by Karpathy on X this week, has quickly gained traction, drawing praise from Twitter co-founder Jack Dorsey, who described it as a “great idea file.” Within two days, Karpathy’s posts had amassed around 16 million views and more than 100,000 bookmarks, with many users beginning to build their own personal wiki systems based on the concept.


What is the problem?

Karpathy described how most current LLM workflows rely on retrieving relevant chunks of information from uploaded documents at the time of a query, a method commonly referred to as retrieval-augmented generation (RAG).

RAG improves LLM outputs by combining them with external data. The framework involves first retrieving relevant information from sources like databases or the web, cleaning and processing it, and then feeding it into the model to guide its response. This helps the LLM generate answers that are more accurate, up-to-date, and grounded in facts, reducing hallucinations.

Modern RAG systems use advanced search methods such as semantic search and vector databases to find the most relevant information quickly and efficiently.
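The retrieve-then-answer loop described above can be sketched in a few lines of Python. This is a deliberately toy version: the "embedding" is a bag-of-words count vector rather than a learned model, and there is no vector database, but the shape of the pipeline (embed, rank by similarity, feed the top chunks into the prompt) is the same.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words count vector. Production RAG systems
    # use learned embedding models and a vector database instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    # Rank the stored chunks by similarity to the query; return the top k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query, chunks):
    # Feed the retrieved context to the model alongside the question.
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}"

chunks = [
    "The Eiffel Tower is 330 metres tall.",
    "Photosynthesis converts sunlight into chemical energy.",
]
prompt = build_prompt("How tall is the Eiffel Tower?", chunks)
```

Note that every question repeats this whole loop from scratch over the raw chunks, which is exactly the limitation Karpathy's proposal addresses.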

RAG is widely used in chatbots and enterprise tools where access to current or specialised information is important. By connecting LLMs to external knowledge sources, it enables more accurate, context-aware, and scalable AI applications.

While effective, this approach requires the model to rediscover and reassemble knowledge each time a question is asked, limiting the accumulation of insight over time.

What is the solution? How does it work?

Karpathy proposes an alternative model in which the LLM builds and updates a connected wiki or encyclopedia from a user’s documents. It reads new sources, adds key points, updates pages, and links related ideas, improving the knowledge base over time.

“Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn't just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki…The knowledge is compiled once and then kept current, not re-derived on every query,” he added.

He proposes that the system would have three main layers. The first one would include all the raw and original sources, like articles, papers, and data files that the LLM reads but never changes. The second is the wiki, a collection of plaintext document pages created and maintained entirely by the LLM, where it writes summaries, links ideas, and keeps information updated and organised. The third is the schema, a set of instructions that guides the LLM on how to structure the wiki and handle tasks like adding new sources or answering questions, helping it act consistently and systematically rather than like a general chatbot.
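A minimal sketch of these three layers might look like the following. The directory names and the `ingest` function are my own illustrative labels, not from Karpathy's post, and the `summarise` function is a stand-in for a real LLM call; the point is only to show sources staying read-only while the wiki pages accumulate and interlink.

```python
from pathlib import Path

# Hypothetical three-layer layout: sources/ holds read-only originals,
# wiki/ holds LLM-maintained markdown pages, schema.md would hold the
# instructions that tell the LLM how to structure and update the wiki.
ROOT = Path("knowledge_base")
SOURCES, WIKI = ROOT / "sources", ROOT / "wiki"

def summarise(text: str) -> str:
    # Stand-in for the LLM call that extracts key points from a source;
    # here we simply keep the first sentence.
    return text.split(".")[0].strip() + "."

def ingest(source_name: str) -> Path:
    # Read a raw source once, then integrate its key points into the wiki.
    # The source file is never modified; only wiki pages are written.
    text = (SOURCES / source_name).read_text()
    page = WIKI / (Path(source_name).stem + ".md")
    body = page.read_text() if page.exists() else f"# {Path(source_name).stem}\n"
    page.write_text(body + f"\n- {summarise(text)} (from [[{source_name}]])\n")
    return page

SOURCES.mkdir(parents=True, exist_ok=True)
WIKI.mkdir(parents=True, exist_ok=True)
(SOURCES / "note.txt").write_text("LLMs can maintain wikis. More detail follows.")
page = ingest("note.txt")
```

Because each source is compiled into the wiki once at ingest time, later questions can be answered from the organised pages instead of re-processing the raw files.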

Over time, this wiki becomes richer and more comprehensive, enabling the LLM to answer complex queries by drawing on an already organised body of information rather than reconstructing it from scratch.

How does this change the user’s and the LLM’s roles?

In this framework, the user’s role shifts to curating sources and guiding inquiry, while the LLM handles summarisation, cross-referencing, and maintenance. Karpathy even suggests tools such as Obsidian as an interface to browse the evolving wiki, visualise connections, and review updates in real time.

Obsidian is a local-first note-taking tool that stores notes as Markdown files and links them into a networked knowledge graph. A Markdown (.md) file is a plain-text document with lightweight formatting; in AI workflows, such files are commonly used to provide context, instructions, and documentation to models, especially coding agents.
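For illustration, a single page in such an LLM-maintained wiki might look like the following. The page name, links, and bullet points are invented for this example, not taken from Karpathy's post; the `[[double-bracket]]` syntax is the Obsidian-style internal link that connects pages into a graph.

```markdown
# Retrieval-Augmented Generation

A technique that grounds LLM answers in external data.
Related: [[Vector Databases]], [[Persistent Wiki]]

## Key points
- Retrieves relevant chunks from raw documents at query time
- Contrast with the [[Persistent Wiki]] approach, where knowledge
  is compiled once at ingest time and then kept current
```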

Where can this approach be used?

Karpathy suggests that users can deploy this strategy to track personal goals, health, or learning by organising journals, articles, and notes into a structured system. Researchers can build a growing knowledge base over time as they read and analyse material. Readers can create detailed companion wikis for books, mapping characters, themes, and plots. Teams and businesses can maintain internal knowledge systems using meeting notes, documents, and conversations.
