Chronicle

Why Tags Collapse at Scale

Every tagging system drifts. Here's what happens when it does, and what comes after tags.

I have never seen a tagging system survive contact with reality. If you’ve maintained a personal knowledge base for more than a year, you already know what I mean. You start with enthusiasm. You end with a graveyard of labels nobody searches.

Every tagging system follows the same lifecycle: optimism, drift, proliferation, abandonment.

In the beginning, you create a few clean tags. #biology, #project-alpha, #reading-notes. It feels good. You are organized. You are the kind of person who has a system.

Then six months pass. You’ve added #bio, #biology-notes, #mol-bio, and #molecular-biology. You’re not sure which ones you used where. You don’t check. You keep writing.

By year two, you have three hundred tags. Searching by tag returns either everything or nothing useful. The system is still there. You just don’t use it anymore.
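
To make the drift concrete, here is a minimal Python sketch. The note names and the `search_by_tag` and `coverage` helpers are hypothetical, made up for illustration. It assumes tag search is an exact match on the label, which is how most simple tag systems behave.

```python
# Hypothetical notes, each tagged with whichever variant came to mind that day.
notes = {
    "krebs-cycle": {"#biology"},
    "crispr-basics": {"#bio"},
    "protein-folding": {"#mol-bio"},
    "rna-splicing": {"#molecular-biology"},
    "cell-membrane": {"#biology-notes"},
}

def search_by_tag(tag):
    """Return the sorted names of notes carrying exactly this tag."""
    return sorted(name for name, tags in notes.items() if tag in tags)

def coverage(tag):
    """Fraction of all notes that a search for this tag returns."""
    return len(search_by_tag(tag)) / len(notes)

print(search_by_tag("#biology"))  # only one of the five biology notes
print(coverage("#biology"))
```

All five notes are about biology, but each variant only finds its own sliver. Nothing in the system knows the five labels mean the same thing.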


The design flaw is temporal. Tags require you to predict how you’ll search before you know what you’ll write.

You’re in the middle of capturing an idea, something half-formed, maybe a connection between two papers. And the system asks you to stop and answer: What category does this belong to? You don’t know yet. That’s the whole point of writing it down.

“Tags… tagging notes often isn’t practical at early stages of discovery,” as one person on our waitlist put it. Another said: “Once I’m backlogged without having organized up front, I’m screwed; notes may as well go into a blackhole.” The organizational tax comes due at the worst possible moment, when the idea is fresh and fragile and you should be thinking, not filing.


The obvious response is automation. Let the machine tag for you.

This makes the problem worse, and it does so faster. Auto-tagging doesn’t solve tag drift. It accelerates it. You get three hundred tags in three months instead of two years. The taxonomy was never coherent in the first place. Now it’s incoherent at machine speed.


There’s a deeper issue. Even a perfectly maintained tag system has a ceiling, and it’s a low one. Tags are flat labels. They tell you what something is about. They don’t tell you how things relate.

Say you’re researching a compound. It connects to a plant species. It binds to a receptor. That receptor is part of a signaling pathway. The pathway is implicated in a disease. The disease was the subject of a paper you read last month, which referenced a clinical trial you bookmarked.

A tag like #pharmacology captures almost none of this. Even a careful set of tags like #compound, #receptor, #pathway gives you buckets. What you actually need is the thread that runs between them. Folders force hierarchy. Tags force flatness. Neither captures the topology of how ideas actually connect.
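
One way to picture the difference is to model the compound example as typed edges instead of labels. This is a sketch under assumptions: the relation names (`binds_to`, `part_of`, and so on) and the `build_graph` and `find_thread` helpers are hypothetical, not any particular tool's data model.

```python
from collections import deque

# Hypothetical typed relationships, following the compound example above.
edges = [
    ("compound", "found_in", "plant species"),
    ("compound", "binds_to", "receptor"),
    ("receptor", "part_of", "signaling pathway"),
    ("signaling pathway", "implicated_in", "disease"),
    ("paper", "about", "disease"),
    ("paper", "references", "clinical trial"),
]

def build_graph(edges):
    """Index edges in both directions so a thread can be followed either way."""
    graph = {}
    for source, relation, target in edges:
        graph.setdefault(source, []).append((relation, target))
        graph.setdefault(target, []).append((relation + " (reversed)", source))
    return graph

def find_thread(graph, start, goal):
    """Breadth-first search; returns the shortest chain of nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _relation, neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

graph = build_graph(edges)
thread = find_thread(graph, "compound", "clinical trial")
print(" -> ".join(thread))
```

A set of tags can tell you that the compound and the clinical trial are both in the #pharmacology bucket. Only the edges can tell you how to get from one to the other.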


So what comes after tags? When you write that a compound binds to a receptor, that is the connection. It’s sitting right there in your sentence. The question is whether your tool can see it.

Tags categorize. Relationships connect. “This note is about pharmacology” is a tag. “This compound activates this receptor, which modulates this pathway” is a web of meaning.
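
As a deliberately naive illustration of a tool "seeing" the connection in a sentence, the sketch below splits a sentence on a known relation verb. The verb list, the placeholder entity names, and the `extract_relation` function are all assumptions made for this example. Real text needs far more than string matching, but the shape of the output is the point.

```python
# Hypothetical list of relation verbs to look for; real prose needs much more.
RELATION_VERBS = ["binds to", "activates", "modulates", "inhibits"]

def extract_relation(sentence):
    """Return (subject, verb, object) for the first known verb found, else None."""
    text = sentence.strip().rstrip(".")
    for verb in RELATION_VERBS:
        marker = f" {verb} "
        if marker in text:
            subject, obj = text.split(marker, 1)
            return (subject.strip(), verb, obj.strip())
    return None

for sentence in [
    "Compound X binds to receptor Y.",
    "Receptor Y modulates pathway Z.",
    "Remember to email the lab.",
]:
    print(extract_relation(sentence))
```

The first two sentences yield a triple each. The third yields nothing, because it states no relationship the sketch knows about.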

You still want manual links for intentional connections. When you write [[something]], you’re saying these two ideas belong together and I know why. But you can’t link to a note you’ve forgotten exists. The interesting territory is what happens when you combine intentional links with relationships that surface automatically, based on what your notes actually say.
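
Here is a rough sketch of that combination, assuming notes are plain strings keyed by title and that an "automatic" relationship is nothing smarter than another note's title appearing in the text. The note contents and the `explicit_links` and `unlinked_mentions` helpers are hypothetical.

```python
import re

# Matches [[Title]] and captures the text between the brackets.
WIKILINK = re.compile(r"\[\[([^\[\]]+)\]\]")

# Hypothetical notes keyed by title.
notes = {
    "Receptor Y": "Part of the signaling pathway Z.",
    "Compound X": "Binds to [[Receptor Y]]. See also the signaling pathway Z trial.",
    "Signaling Pathway Z": "Implicated in the disease.",
}

def explicit_links(body):
    """Titles the author linked on purpose with [[...]]."""
    return {match.strip() for match in WIKILINK.findall(body)}

def unlinked_mentions(title, body, all_titles):
    """Other note titles that appear in the text but were never linked."""
    linked = explicit_links(body)
    plain = WIKILINK.sub("", body).lower()
    return {
        other
        for other in all_titles
        if other != title and other not in linked and other.lower() in plain
    }

body = notes["Compound X"]
print(explicit_links(body))
print(unlinked_mentions("Compound X", body, notes.keys()))
```

The explicit link is the connection you knew about. The unlinked mention is the one you had forgotten was a note at all.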


Your notes already contain the structure. It’s in the words you chose, the references you made, the proximity of ideas on the page. The work isn’t inventing a taxonomy and imposing it. The work is making the existing structure visible.

That’s a harder problem. It’s also the right one.