Learn to Forget to Learn
Forgetting is essential as long as it happens the "right" way. Deliberately choosing not to remember is a tool of a master. What is vital will be (re)acquired.
This past Saturday, I attended a four-year-old kid’s party thrown by friends. It was quite a gathering, with many familiar and new faces among pink ponies and unicorns. One such new face was an SLP, or speech-language pathologist, working with French-speaking patients.
The hook connecting our small talk to a more substantial conversation was my wife’s struggles to get some obscure French sounds right. She mentioned something I had once deliberately chosen not to remember—synaptic pruning.
Although I’ve only encountered this term in the context of human learning, this phenomenon is characteristic of many mammals.1
The most relevant thing about synaptic pruning is that it’s widely thought to represent learning. From early childhood through puberty, we lose synaptic density. Things absorbed early might be spared, but the “use it or lose it” paradigm kicks in and goes on a synaptic purge.
This doesn’t mean parents should cram as much as possible into infants’ brains as early as possible. If they did, we might witness oddities like the ones described below when discussing RAG LLM systems and seniors’ “tip-of-the-tongue” bug.
But let’s discuss information overload first.
Do You Need This?
One of my favourite, very à propos Sherlock Holmes titbits is quoted below. It’s an excerpt I always remember whenever someone becomes upset that I don’t read the news besides the RSS feed of my work tools.
“You see,” he explained, “I consider that a man’s brain originally is like a little empty attic, and you have to stock it with such furniture as you choose. A fool takes in all the lumber of every sort that he comes across, so that the knowledge which might be useful to him gets crowded out, or at best is jumbled up with a lot of other things, so that he has difficulty laying his hands upon it. Now the skillful workman is very careful indeed as to what he takes into his brain-attic. He will have nothing but the tools which may help him in doing his work, but of these he has a large assortment, and all in the most perfect order. It is a mistake to think that that little room has elastic walls and can distend to any extent. Depend upon it there comes a time when for every addition of knowledge you forget something that you knew before. It is of the highest importance, therefore, not to have useless facts elbowing out the useful ones.”
“But the Solar System!” [Dr. Watson] protested.
“What the deuce is it to me?” he interrupted impatiently: “you say that we go round the sun. If we went round the moon it would not make a pennyworth of difference to me or to my work.”2
As an engineer working with vast amounts of data and AI, I find the “less is more” adage very accurate. To ambitious claims that this and that company ingested a gazillion petabytes of data to make it instantly searchable and usable, the first question I’ll ask is, “At what cost?” By cost, I don’t mean only monetary expenditure. Yes, ever-cheapening disk storage might trick us into thinking that our attic’s walls are expandable, but the more lumber you bring in, the higher the cost of ownership and the slower the retrieval. Beyond a certain threshold, it gets incredibly challenging to find the right tool. As we’ve previously seen in the issue dedicated to the Iron Triangle, something’s got to give.
So, we’re forced to use additional strategies, such as turning documents into vector spaces for faster retrieval. These strategies inevitably sacrifice accuracy, and their speed and cost can be prohibitive, as I’ve recently learned from using a few top-of-mind vector databases.
So-called RAG, or Retrieval-Augmented Generation, AI systems use knowledge bases backed by vector databases to generate answers grounded in objective supporting evidence: the documents they were fed. RAGs can counteract one of the significant flaws of large language models: the occasional invention of facts, also known as hallucinations. But as companies started throwing more data at them, they realised the more you feed your RAGs, the less resistant they become to that shortcoming.
Imagine a scenario where a hypothetical RAG ingested a massive corpus of investor relations documents. The answers such a system gives will be grounded in truth. However, the vector space it creates can lead to similar but unrelated terms being accidentally used together. It wouldn’t be impossible to get an answer, with properly attached supporting evidence, claiming something like “User retention rate of Microsoft Office 365 for the last quarter increased by 10%, as announced by the CEO Mark Zuckerberg.”
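A toy sketch of how such a mix-up can arise (hypothetical documents, and a bag-of-words count vector as a crude stand-in for a real embedding model): two unrelated snippets share enough vocabulary with the query that both land in the top-k context handed to the generator.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "microsoft office 365 user retention rate increased 10 percent last quarter",
    "meta ceo mark zuckerberg announced quarterly results to investors",
    "the cafeteria menu changes every other week",
]

query = "which ceo announced the user retention numbers last quarter"
q = embed(query)
top2 = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:2]
# Both investor-relations snippets outrank the unrelated one, so the
# generator sees Microsoft's retention figure next to Meta's CEO and can
# stitch them into one "grounded" sentence.
print(top2)
```

Real systems use learned embeddings rather than word counts, but the failure mode is the same: retrieval ranks by proximity in a shared space, not by whether two passages actually belong together.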
Note: It’s not all doom and gloom. There are ways to reduce the likelihood of the above, such as introducing knowledge graphs, which we often discuss here. GraphRAG is the new cool kid on the block. However, much of the above wouldn’t happen if Sherlock Holmes were your Chief Data Officer.
An overloaded RAG becomes the equivalent of a senior who struggles to remember seemingly basic things, such as the names of people or objects. They’ll often mix them up because of similarity, the equivalent of proximity of vectors in a vector database. It’s on the tip of the tongue, but a mature brain holds much more information than a younger one. Seniors have simply seen, eaten, smelled and thought more. Sorry, (younger) folks, that is another reason to respect their opinions. Granted, the struggle to navigate this vast knowledge space looks like a bug to a younger fellow.
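To see why a fuller memory makes near-misses more likely, here’s a small simulation (random unit vectors standing in for stored memories; the dimensions and counts are arbitrary): as the store grows, the nearest neighbour of any query tends to creep closer, so lookalikes crowd whatever you’re actually trying to recall.

```python
import math
import random

random.seed(42)

def unit(d: int) -> list[float]:
    # A random direction in d-dimensional space, normalised to length 1.
    v = [random.gauss(0, 1) for _ in range(d)]
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def nearest_sim(query: list[float], corpus: list[list[float]]) -> float:
    # Cosine similarity of the closest stored vector to the query.
    return max(sum(q * c for q, c in zip(query, vec)) for vec in corpus)

d = 64
query = unit(d)
small = [unit(d) for _ in range(100)]          # a young, sparse memory
large = small + [unit(d) for _ in range(9900)] # the same memory, 100x fuller

# With more stored "memories", the closest unrelated vector typically
# sits much nearer the query: distractors crowd in as the attic fills up.
print(nearest_sim(query, small), nearest_sim(query, large))
```

None of these vectors is related to the query, yet the best match in the larger store scores at least as high as in the smaller one, which is exactly the tip-of-the-tongue crowding described above.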
Food for thought: Could the engineering approach to dealing with data overflow and mix-ups also be used for human brains? Could the current trend of building a Second Brain, the equivalent of using connected knowledge to counteract hallucinations, be that solution? GraphRAGs are the latest trend; in a way, they’re the equivalent of a RAG’s second brain.

Press Delete?
It’s human to feel the pain of a loss much more acutely than the joy of a gain. As Warren Buffett says, we tend to cut the flowers and water the weeds in our investment portfolios. The sunk cost fallacy is a potent cognitive bias that might make you cinematically sweat and shake while hovering over the delete button. So, some don’t. And that’s fine, too.
…
There’s a little-known interview between the father of the Zettelkasten, Niklas Luhmann, and Wolfgang Hagen, in which he admits to never pruning his notes, even when they become irrelevant. He just leaves them there, unattached, as zombie notes. This debris never stopped him from creating ~90,000 handwritten notes, ~600 publications and ~60 books.
…
The PARA method is also a non-destructive approach to information management, where items are moved to the Archive instead of removed. This is what the last A in PARA stands for.3 In a way, this out-of-sight, out-of-mind shift could be considered glorified non-destructive pruning.

…
A recent study of the cognitive consequences of having information at our fingertips revealed two interesting facts:4
When we know the information we saw online will remain accessible in the future, we’re less likely to remember it.
Even though we don’t remember it as well, we’re still able to remember approximately where we’ve seen it and to retrieve it later.
Those two statements also describe how I feel. These days, personal knowledge management tools streamline keeping bookmarks and highlights with citations from this collective memory of the Internet and books we all have access to. Not having to search for something twice has a lot of value for a knowledge engineer.
When things change, and something that used to be just an interesting fact becomes essential, I’ll run a local search and pull out the note or bookmark, a segue into deeper research.
…
All of the above counter-arguments require active reviewing and maintenance of personal notes, a practice that will make them work regardless of whether you prune them. We’ve discussed its importance here:
You could review the entire corpus of your PKM vault or avoid certain zones, such as your PARA archives, but the lack of active periodic reviewing makes some people think note-making doesn’t work.
One thing’s sure: now that it has resurfaced and led to an essay, I will never forget what synaptic pruning is. What was pruned was (re)acquired.
https://en.wikipedia.org/wiki/Synaptic_pruning
Doyle, A. C. (2007). A Study in Scarlet. Modern Library.
https://fortelabs.com/blog/para/
Betsy Sparrow et al., “Google Effects on Memory: Cognitive Consequences of Having Information at Our Fingertips,” Science 333, 776–778 (2011). DOI: 10.1126/science.1207745