AI’s next bottleneck is not context size. It is stewardship.
The agent era needs maintained knowledge environments: metadata, curation, preservation, source quality, permissions, freshness, and memory hygiene.
Machine Room Claims AI’s next bottleneck is not context size. It is stewardship. A larger context window can hold more text, but it does not know what deserves to be there. A retrieval system can fetch more documents, but it does not automatically understand which ones are stale, duplicated, misleading, private, low-quality, or badly described. An agent can read a folder, search a database, and summarize a pile of sources, but someone still has to decide what the pile means. That is the old library problem, wearing a new machine face. The internet has spent years optimizing for storage and access. Save everything. Search everything. Index everything. Now AI systems are making that bargain feel incomplete. If machines are going to reason over our shared knowledge, then the quality of their work depends on the quality of the environments we give them. Receipts are not libraries. Provenance can tell an agent where something came from, who signed it, or what touched it. That matters. But provenance alone does not curate. A receipt does not decide whether a document belongs in the collection, whether it has been superseded, whether it is the best available source, or whether it should be shown to a particular user. That work is librarianship. Not librarianship as nostalgia. Not the quiet stereotype of shelves and stamps. Librarianship as infrastructure: metadata, appraisal, preservation, access, authority, context, rights, classification, citation, and care. A good library is not just a warehouse of books. It is a maintained knowledge environment. The catalog matters. The subject headings matter. The preservation rules matter. The provenance of records matters. The decision to keep, discard, repair, describe, restrict, or surface an item matters. AI systems are rediscovering this through RAG — retrieval-augmented generation. The basic idea is simple: instead of relying only on what a model learned during training, connect it to an external knowledge store and let it retrieve relevant material. But the hard part is not the acronym. The hard part is the collection. If the collection is messy, retrieval is messy. If the chunks are stripped of context, the model receives fragments without their original meaning. If the sources are stale, the answer can be stale. If duplicate documents conflict, the system may treat noise as consensus. If metadata is weak, useful material stays buried. RAG did not make the library problem disappear. It made the library problem executable. Context windows create a similar illusion. It is tempting to think that if a model can read a million tokens, curation becomes less important. Just put everything in. Let the model figure it out. But context is not an archive. Context is closer to working memory. It is temporary, crowded, and shaped by attention. Research on long-context models shows that information can be missed or underused depending on where it appears. Bigger windows help, but they do not turn disorder into understanding. A warehouse is not a library because it is large. A context window is not a knowledge system because it is long. The machines need librarians because knowledge has a lifecycle. Information is created, described, revised, challenged, archived, retracted, forgotten, restored, and reused. The Digital Curation Centre’s lifecycle model captures this better than most AI diagrams: curation is not a one-time ingestion step. It is ongoing care. That care becomes even more important when agents have memory. Agent memory is powerful, but it can rot. It can preserve bad assumptions. It can overfit to old preferences. It can remember something private in the wrong context. It can merge temporary instructions with durable facts. It can confuse a passing mood for a stable decision. Memory hygiene is librarianship by another name. A useful agent should not merely remember. It should know what kind of memory it is holding. Is this a user preference, a project decision, a source excerpt, a secret, a task, a draft, a correction, or a disputed claim? When was it recorded? Who approved it? What supersedes it? When should it expire? Those are catalog questions. The FAIR principles — findable, accessible, interoperable, reusable — were written for scientific data, but they map cleanly onto agent knowledge environments. If a machine cannot find a record, the record might as well not exist. If it can find the record but cannot interpret it, the record becomes friction. If it can interpret the record but cannot judge its scope, the record becomes dangerous.