Your Connected Notes Hide Stories You Probably Don't See
How to read between the lines of interconnected information.
A colleague messaged me to ask for my opinion. He had drawn part of our data model as an Entity-Relationship Diagram, or ERD. It’s a handy diagramming convention for expressing the things that matter to your business and how they relate to each other.
His architecture had two entities, a Book and an Author, and looked something like this:
In the diagram above, the relationship, expressed in “crow’s foot notation”, indicates that an Author could have written zero or many Books. A Book, however, should always have at least one Author.
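For readers who think in code rather than diagrams, here is a minimal Python sketch of the same two entities and their cardinalities. The class and field names are my own illustration, not my colleague's actual model.

```python
from dataclasses import dataclass, field


# eq=False keeps comparisons identity-based, so the mutual
# Author <-> Book references never trigger a recursive comparison.
@dataclass(eq=False)
class Author:
    name: str
    # Zero or many: an Author may exist without any Books.
    books: list["Book"] = field(default_factory=list)


@dataclass(eq=False)
class Book:
    title: str
    # One or many: a Book must always have at least one Author.
    authors: list[Author]

    def __post_init__(self):
        if not self.authors:
            raise ValueError("a Book needs at least one Author")
        # Keep the other side of the relationship in step.
        for author in self.authors:
            author.books.append(self)
```

Creating an `Author` on its own is fine, while `Book("Untitled", [])` raises an error, which mirrors the two ends of the relationship line.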
This would make sense if we were designing software dedicated solely to authorship. But what if we also wanted to capture information on the CEOs of various companies? We could model it like so:
This works well until, one day, the CEO decides to write a Book. What should we do if that happens?
There are several ways of dealing with this, and they all stink. We could:
Create a duplicate for the CEO in the Author table and connect him to his new Book.
Do the opposite: copy the Author’s record into the CEO table and add his Company as a connection.
Create a new connection between the Author and the CEO entities as an identity bridge.
Even though it looks like we’ve solved the problem, we’ve introduced bigger ones. By doing those things, we’ve either duplicated information or created unnecessary plumbing. As a result, we now have to make sure our duplicates don’t get out of sync, and we’ll have to maintain the Author ←→ CEO link as a bonus. On top of that, it’s still not evident that two records in two different places represent the same person. What if the Company goes bankrupt? What if the person’s no longer the CEO?
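To make the sync problem concrete, here is a small Python sketch of the duplicate-record option. The tables, IDs, and the email field are hypothetical and exist only for illustration.

```python
# Hypothetical records: the same person stored twice, once per table.
authors = {1: {"name": "Jane Doe", "email": "jane@example.com"}}
ceos = {7: {"name": "Jane Doe", "email": "jane@example.com"}}

# An update arrives through the CEO side and touches only that copy.
ceos[7]["email"] = "jane@newco.example"

# The two copies now disagree, and nothing in the data itself
# says that author 1 and CEO 7 are the same person.
in_sync = authors[1]["email"] == ceos[7]["email"]
```

After the update, `in_sync` is `False`. Keeping it `True` would take extra code on every write path, which is exactly the plumbing we were hoping to avoid.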
A Better Way
Premature optimisation is the root of all evil.
—SEASONED ENGINEERS
Don’t overtighten your design. An Author or a CEO is nothing but a particular case of a more generic Individual.
The only thing that turns an Individual into an Author is that they wrote (←→) a Book; the only thing that turns an Individual into a CEO is that they run (←→) a Company.
The presence or absence of a connection conveys a lot of information.
Now, we can add or remove entities without butchering the model, and we’ll never lose data in the process. The facts about the Individual are not limited to the entity itself. A lot of them are hidden in their relationships with the environment.
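Here is one way this could look in Python, as a rough sketch rather than a finished design. Relationships are stored as plain (subject, relation, object) triples, and a role is derived from them on demand. The relation labels `wrote` and `runs`, the `roles` helper, and the sample names are all assumptions of mine.

```python
from dataclasses import dataclass


# Frozen dataclasses are hashable, so instances can live inside a set.
@dataclass(frozen=True)
class Individual:
    name: str


@dataclass(frozen=True)
class Book:
    title: str


@dataclass(frozen=True)
class Company:
    name: str


# Assumed mapping from a relation label to the role it implies.
ROLE_FOR_RELATION = {"wrote": "Author", "runs": "CEO"}


def roles(person, relationships):
    """Derive a person's roles purely from the links they take part in."""
    return {
        ROLE_FOR_RELATION[relation]
        for subject, relation, _ in relationships
        if subject == person and relation in ROLE_FOR_RELATION
    }


jane = Individual("Jane Doe")
acme = Company("Acme")
relationships = {(jane, "runs", acme)}

# The CEO writes a book: one new link, no duplicated person.
relationships.add((jane, "wrote", Book("Memoir")))
```

At this point `roles(jane, relationships)` yields both roles. If Jane steps down, removing the `(jane, "runs", acme)` triple is enough: she stops being a CEO, stays an Author, and the Individual record itself is never touched.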
Try replacing entities with your notes; the same principles still apply.
When your knowledge management system matures and you move on to the fun part, connecting your notes, it’s easy to overlook that links can convey as much information as the notes they connect. There are instances when the information belongs not in the note but in its relationship to other notes.
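The same idea carries over to notes. The sketch below is a toy model and not the API of any particular note-taking tool: the note titles, the link labels, and both helper functions are made up for illustration.

```python
# Notes are just titles here; links are (source, label, target) triples.
notes = {"Jane Doe", "Memoir", "Acme", "Reading list"}
links = [
    ("Jane Doe", "wrote", "Memoir"),
    ("Jane Doe", "runs", "Acme"),
]


def links_mentioning(note, links):
    """Return every link that has `note` at either end."""
    return [link for link in links if note in (link[0], link[2])]


def unlinked(notes, links):
    """Return notes with no links at all; the absence is itself a signal."""
    connected = {end for source, _, target in links for end in (source, target)}
    return sorted(notes - connected)
```

Here `links_mentioning("Jane Doe", links)` tells you more about Jane than her note might, and `unlinked(notes, links)` surfaces "Reading list" as a note that has not yet been connected to anything.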
Don’t make anything too specific too early in the process. Don’t prematurely overtighten your classification scheme: folders, tags, connections, and naming conventions. The absence of a connection is itself information. Pay attention to relationships and the intel they hide.
Read between the lines.