Store Like an Engineer. Chapter 2: Files
Folders are easy to create and provide a false sense of being organised. It's a slippery slope that might leave you with more chaos on your hands than you started with.
It’s the weekend again, and I’m in a cleaning/organisation frenzy. Last weekend was dedicated to the cornerstone storage type of Personal Data Lakes: objects.
Today, the limelight is on another equally important storage type: files. Luckily, you're likely already familiar with the paradigm of folders and files.
What’s interesting about files and folders is that, these days, the first things that pop into your mind are most likely not actual manila folders and yellowish A4 (or US Letter) sheets of paper. Instead, you’re probably thinking about the files and folders in your computer’s operating system. This is funny because operating systems adopted concepts such as a desktop, a trash bin, a contact book, a folder and a file precisely because they were digital twins of their analogue counterparts. Oh, how the tables have turned.
It’s difficult to overstate how widely filing cabinets were adopted in knowledge work. We discussed them in one of the previous issues.
However, don’t be too superficial about what a folder is. Money, for example, is “easy” to understand, but how many of us truly understand money?
File Storage
Unlike objects thrown into a commonplace box, file storage introduces an organisational hierarchy. Files are stored within folders, which are in turn stored within other folders. This creates an inverted tree that you can traverse to find what you want, provided that you don’t take a wrong turn along the way and that you (the current you) and the archiver (the past you) follow the same set-in-stone classification logic, one that can withstand the test of time and the issues described below.
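To make the tree idea concrete, here is a minimal Python sketch. The folder names and the helper functions `build_example_tree` and `find_file` are made up for illustration; the point is only that finding a file means walking the branches below some root.

```python
import tempfile
from pathlib import Path


def build_example_tree(root: Path) -> None:
    """Create a tiny, hypothetical folder hierarchy under root."""
    (root / "work" / "invoices").mkdir(parents=True)
    (root / "selfies").mkdir()
    (root / "work" / "invoices" / "2024-01.txt").write_text("invoice")
    (root / "selfies" / "beach.txt").write_text("photo placeholder")


def find_file(root: Path, name: str) -> list[Path]:
    """Walk every branch below root and return matching files, relative to root."""
    return sorted(p.relative_to(root) for p in root.rglob(name) if p.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_example_tree(root)
        print(find_file(root, "2024-01.txt"))
```

Notice that the search only succeeds quickly if you already know roughly which branch to walk; otherwise you are back to scanning everything.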
File Linking
One of the most glaring shortcomings of categorising files into separate folders is that one file can only be part of one folder.
Imagine classifying toys (once thrown into a common box) into dedicated drawers:
1. Electric toys
2. Non-electric toys
3. Soft toys
4. Hard toys
5. Up to 6-years-old toys
6. Over 6-years-old toys
Where would you put Winnie the Pooh, which runs on batteries and plays back lullabies? Drawer 1, 2, 3, or 5?
The answer is: it depends, and unfortunately, it depends on the moment you want to play with Winnie in the future.
David Allen, the author of “Getting Things Done” [1], defines organization as finding things where you expect them to be. In the example above, picking any of the possible options would leave us less organized than before. Yes, it used to be one black hole of a box to dig through, but at least we were certain the toy was somewhere in there.
There are several ways of dealing with this, and none of them are beautiful.
Note: Those familiar with the engineering debate of inheritance versus composition will recognise the same concerns in classifying files using folders or tags, respectively. Knowledge engineers should be aware of this important discussion, but we’ll dive into it in another issue.
Duplicates
You can buy three additional Winnie the Pooh dolls and store each in the “right” drawer. Besides the insanity of the associated cost and the waste of space, if you throw one away, you’ll still be left with three units, and your organisation will, once again, become corrupted.
They’re also now four different non-fungible instances of the same toy, which means that they’ll evolve separately. If you tore off one of his arms because that’s how you like it, the other copies remain unchanged. Generally, we want all copies of our files to remain in sync, and independent duplicates don’t do that.
If you outgrew the toy and wanted to prune the room, you’d have to open four drawers and throw four things away.
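The drift between duplicates is easy to reproduce on a computer. Below is a minimal sketch, assuming a made-up file `pooh.txt` and four hypothetical drawer folders; `store_copies` and `tear_arm` are illustrative helpers, not part of any library.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical drawer names, matching the four candidate drawers above.
DRAWERS = ["drawer-1", "drawer-2", "drawer-3", "drawer-5"]


def store_copies(root: Path) -> list[Path]:
    """Put an independent copy of pooh.txt into every drawer."""
    original = root / "pooh.txt"
    original.write_text("Pooh: two arms\n")
    copies = []
    for drawer in DRAWERS:
        folder = root / drawer
        folder.mkdir()
        destination = folder / "pooh.txt"
        shutil.copy(original, destination)  # a real copy: new file, new content
        copies.append(destination)
    original.unlink()
    return copies


def tear_arm(copy: Path) -> None:
    """Modify a single copy; the others know nothing about it."""
    copy.write_text("Pooh: one arm\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        copies = store_copies(Path(tmp))
        tear_arm(copies[0])
        for copy in copies:
            print(copy.parent.name, "->", copy.read_text().strip())
```

Only the first drawer reflects the change, and a full clean-up still needs one deletion per drawer.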
Hard Links
You could commit to storing Winnie in the first drawer (electric toys). In the remaining “relevant” drawers, you’d put a note to your future self:
“Pooh is in drawer 1 - Electric Toys. Thank me later!”
—THE PAST YOU
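Opening the “wrong” drawer will always require just one extra operation: opening the correct drawer to find the toy. (On an actual filesystem the analogy is looser: a hard link is a second name for the same file, so there is no extra hop, and the content survives until the last name is removed.)
However, removing any of the notes will once again bring chaos into your organisation, and fully pruning the room of Winnie still requires four separate deletions.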
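To see how this plays out on a computer, here is a minimal sketch, assuming a POSIX-style filesystem that supports hard links. The file and drawer names and the `hard_link_demo` helper are hypothetical. It checks that both names refer to the same data, that an edit made through one name shows through the other, and that the content outlives the deletion of the first name.

```python
import os
import tempfile
from pathlib import Path


def hard_link_demo(root: Path) -> dict:
    """Create a file, hard-link it from a second drawer, and observe the behaviour."""
    toy = root / "drawer-1" / "pooh.txt"
    toy.parent.mkdir()
    toy.write_text("Pooh: two arms\n")

    alias = root / "drawer-3" / "pooh.txt"
    alias.parent.mkdir()
    os.link(toy, alias)  # second name for the same underlying file

    result = {
        "same_file": os.path.samefile(toy, alias),
        "links_before": toy.stat().st_nlink,
    }

    alias.write_text("Pooh: one arm\n")  # edit through the second name
    result["seen_via_first_name"] = toy.read_text()

    toy.unlink()  # remove the first name only
    result["content_after_unlink"] = alias.read_text()
    result["links_after"] = alias.stat().st_nlink
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for key, value in hard_link_demo(Path(tmp)).items():
            print(key, "=", repr(value))
```

The pruning cost is the same as in the drawer story: the data only disappears once every name pointing at it has been deleted.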
Soft Links (aka Symbolic Links)
An alternative way of helping someone else or your future self out is to leave some breadcrumbs:
The first drawer is where the toy is.
The second drawer would have a note saying the toy is in drawer 1.
The third drawer would have a note saying the toy is in drawer 2. You’ll discover that you’ll have to hop once more to get to the toy.
The fourth drawer would have a note saying the toy is in drawer 3. You’ll find out that you’ll have to hop twice more to get to the toy.
Once again, ditching any of the above four elements will leave you with a messier system, which you’ll need to prune four times if you want a total reset. Removing the toy or a note from any drawer but the last might also break the breadcrumb chain.
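The breadcrumb chain can be sketched with symbolic links. This is an illustration under assumptions: a POSIX-style system where creating symlinks is allowed, hypothetical drawer names, and two made-up helpers, `build_chain` and `hops`. Each note points at the entry in the previous drawer, exactly as in the list above.

```python
import os
import tempfile
from pathlib import Path


def build_chain(root: Path) -> list[Path]:
    """Put the toy in the first drawer and a symlink 'note' in each later drawer."""
    drawers = [root / name for name in ("drawer-1", "drawer-2", "drawer-3", "drawer-5")]
    for drawer in drawers:
        drawer.mkdir()
    toy = drawers[0] / "pooh.txt"
    toy.write_text("Pooh\n")
    entries = [toy]
    for previous, drawer in zip(drawers, drawers[1:]):
        note = drawer / "pooh.txt"
        os.symlink(previous / "pooh.txt", note)  # note points at the previous drawer
        entries.append(note)
    return entries


def hops(path: Path) -> int:
    """Count how many notes must be followed before reaching a non-link."""
    count = 0
    while path.is_symlink():
        path = path.parent / os.readlink(path)
        count += 1
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        entries = build_chain(Path(tmp))
        print([hops(entry) for entry in entries])
        entries[1].unlink()  # remove the note in the second drawer
        print("last note still resolves:", entries[3].exists())
```

Deleting a note in the middle leaves the later notes in place but dangling: they still exist as links, yet they no longer lead to the toy.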
Use Cases
Classification results in a dopamine spike: we think we’re improving. Although organizing is a good thing, organizing poorly can backfire.
It is recommended not to “over-tighten” the classification too early. Keep it loose and narrow down over time.
A sound strategy for using folders (even on your computer) would be to store all things relevant to one task. An example would be something as open-ended as “work” or “selfies.”
If your classification has many aspects you’d like to filter things by, you’re better off using metadata, such as tags, instead of file/folder classification. Tagging is not bulletproof, though. There are so many ways of getting it wrong, just as there are ways of getting folder structuring wrong. We’ll get to tags in future instalments.
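As a small preview of the tagging idea, here is a hedged sketch in which each toy carries a set of tags instead of living in exactly one drawer. The `TOYS` catalogue and the `find_by_tags` helper are invented for illustration and do not represent any particular tagging tool.

```python
# Hypothetical catalogue: every item carries as many tags as it needs.
TOYS = {
    "winnie-the-pooh": {"electric", "soft", "up-to-6"},
    "wooden-train": {"non-electric", "hard", "up-to-6"},
    "rc-car": {"electric", "hard", "over-6"},
}


def find_by_tags(items: dict[str, set[str]], *wanted: str) -> list[str]:
    """Return the names of all items that carry every requested tag."""
    required = set(wanted)
    return sorted(name for name, tags in items.items() if required <= tags)


if __name__ == "__main__":
    print(find_by_tags(TOYS, "electric"))
    print(find_by_tags(TOYS, "soft", "up-to-6"))
```

Winnie shows up under every aspect that applies to him without being copied or linked, but the result is only as good as the tags someone remembered to assign.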
Pros
If the right classification choices have been made, file storage provides faster retrieval without search-result “misses”.
Most people are familiar with this storage type, though surprisingly, most are also doing it wrong.
Unlike the single open box full of objects, you can restrict access to certain folders or drawers, giving you more control over who gets to what.
You can move files from one place to another as your classification changes. Things put into a new cabinet will automatically benefit from the characteristics of that new repository, such as more granular access control.
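The access-control point can be sketched as follows, assuming POSIX-style permission bits (this will not behave the same way on Windows). The folder name `private` and the helpers `make_private_cabinet`, `file_into` and `mode_of` are hypothetical.

```python
import shutil
import stat
import tempfile
from pathlib import Path


def make_private_cabinet(root: Path) -> Path:
    """Create a folder that only its owner may list, enter, or modify."""
    cabinet = root / "private"
    cabinet.mkdir()
    cabinet.chmod(0o700)  # owner: read/write/enter; everyone else: nothing
    return cabinet


def file_into(cabinet: Path, file: Path) -> Path:
    """Move a file into the cabinet and return its new location."""
    return Path(shutil.move(str(file), str(cabinet / file.name)))


def mode_of(path: Path) -> str:
    """Return the permission bits of path as an octal string."""
    return oct(stat.S_IMODE(path.stat().st_mode))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        note = root / "diary.txt"
        note.write_text("dear diary\n")
        cabinet = make_private_cabinet(root)
        moved = file_into(cabinet, note)
        print(moved.relative_to(root), mode_of(cabinet))
```

The file itself is not changed by the move; it simply becomes reachable only by those who are allowed into the folder that now contains it.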
Cons
There’s no way to classify the same file into two different folders without some sacrifices. You either make a copy or create a link, and you repeat that process each time you want to store “the same” thing in a new place or delete it altogether.
It’s an organizational treadmill. Your classification needs will likely change over time, and leaving the storage unattended might result in a hierarchy that’s out of sync with your expectations.
It’s difficult to find the right balance. Too specific, and you’ll end up with files that belong in several folders at once and folders that remain almost empty. Too broad, and we’re back to the object storage class, where our folders are as vast as the gigantic flat storage bucket.
Next Up: Blocks
Our next stop is blocks. These are the building blocks (pun intended) of efficient digital storage and your Personal Knowledge Vault.
In the meantime, consider how you’re organizing your pantry or wardrobe. Are you over-tightening your “file” storage?
[1] Allen, D. (2015). Getting Things Done: The Art of Stress-Free Productivity. Penguin.
Hello!
Thank you for the article.
Have you ever actually had to use your notes?
Mine are mostly just for writing things down, and that’s all.
Which method of keeping notes