Data ≠ Information


I mentioned last night that digitised data is twice abstracted, via both hardware and software, but I didn't expand on what I meant by that, so I thought I'd drop a brief note to go some way towards explaining it. Getting back to the written/printed page as the most successful archival method for human-produced material yet invented: a page of text in whatever natural language is the perfect WYSIWYG - 'what you see is what you get' - interface. It is a direct, orthographic, symbolic rendering of human language, unmediated by the medium that transmits it. Its longevity, and hence its long-term viability and reliability as a data retrieval system, is governed only by the quality of the means of production [paper, ink, binding, etc.] and the stability of the long-term storage conditions of the artefacts. No other significant factor comes into play, and as this rendering of the data is the direct analogue of its source, i.e. natural language, and as writing is common to essentially all modern languages, you could argue that this is abstraction layer 0 [zero].

A digitised collection of this written data - for it's principally the written form of language in which we're interested here - has to be stored via two further main abstraction layers, hardware and software, in order to reduce the symbols of language to the symbolic forms that machines can understand and deal with. Neither of these layers of abstraction is one-dimensional, and each involves further sub-strata of abstraction in order to achieve its allotted task. Bear in mind that neither major layer can function without the other: on the one hand you have effectively dumb 'machinery', and on the other you have essentially nothing at all, since software only exists in the liminal space of whatever digital device is in question - except when in its printed form, which is pretty ironic in itself.

In fact, an actual computer file - and this applies to all digital devices: mainframes, desktops, tablets, phones, etc. - when printed out from a suitable software editor takes the form of a hex dump: row upon row of grouped hexadecimal characters. On their own, and without the intervention of yet more software, these are not particularly informative, though ultimately decipherable with some effort and knowledge. But that layer of abstraction is itself a layer above the true nature of data storage: hexadecimal is simply a slightly more digestible version of the language that computing devices actually understand, which is pure binary - ones and zeroes, no more, no less - and pretty much incomprehensible on any meaningful level to a human being.
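To make that concrete, here's a minimal sketch in Python [my own illustration, assuming nothing beyond the standard library] showing the same four characters at each level: readable text, the hexadecimal a file dump would print, and the raw binary beneath it all:

```python
# The same four bytes of text at three levels of abstraction:
# characters, hexadecimal (as a hex dump prints them), and raw binary.
text = "Data"
raw = text.encode("utf-8")

# Hexadecimal view: two digits per byte.
print(" ".join(f"{b:02x}" for b in raw))   # 44 61 74 61

# Binary view: eight bits per byte - the only form the hardware knows.
print(" ".join(f"{b:08b}" for b in raw))   # 01000100 01100001 01110100 01100001
```

Each step is a translation: the hex is already one remove from the binary, and the readable text is another remove again.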

At the hardware level of abstraction, we've got two principal elements: the essentially dumb computing device itself, which can only be brought to life by software that can't actually exist independently of the hardware - nicely circular - and the storage media on which that software and the files we want kept are held for retrieval and use, whether as the means by which the hardware is given its functionality or as the actual information we wish to preserve, sort, retrieve and manipulate in the form of data files. Even at the storage layer there are further abstractions of our data, owing to the plethora of methodologies for organising binary data onto disks, into memory, or onto whatever medium is in use at the time. Oh, and all of this requires, at minimum, an electricity supply to bring the device(s) to life and make our data real and visible to the world.
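By way of a quick illustration [again a rough Python sketch of my own], even a single character has no one physical form: the encoding convention in play decides which bytes actually land on the storage medium:

```python
# One character, three different byte layouts on the storage medium,
# depending entirely on the encoding convention chosen by the software.
char = "é"
for encoding in ("utf-8", "latin-1", "utf-16-le"):
    raw = char.encode(encoding)
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    print(f"{encoding:>9}: {hex_bytes}")

#     utf-8: c3 a9
#   latin-1: e9
# utf-16-le: e9 00
```

So 'the data' isn't even a fixed pattern of bits: it's a pattern of bits plus a convention for reading them back, and if you lose the convention, you've lost the data.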

Anyhow, I think the point I'm trying to make should be fairly obvious by now: to read a book, you take it from the shelf and open it; your senses and brain do the rest automatically. To read a digitised text, all of the above-mentioned mechanisms [and many more, in reality] have to be in place, co-ordinated, and in good working order. At any point, this multi-layer translation from binary to readable output can be completely scuppered by any one of a thousand possible glitches, some of which render the stored data useless forever. OK, you might argue that since most of us have automated cloud storage - something we're probably barely aware of, if at all - backing up all of our stuff in case of local failure, we're covered. Think again.
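Just how little it takes is easy to demonstrate [one last illustrative Python sketch]: flip a single bit in a stored byte sequence, and the decoding software may be unable to reconstruct the text at all:

```python
# One flipped bit - a plausible storage glitch - and the decoder gives up.
data = bytearray("To read a book, take it from the shelf.".encode("utf-8"))
data[3] ^= 0b10000000   # corrupt a single bit in the fourth byte

try:
    print(data.decode("utf-8"))
except UnicodeDecodeError as err:
    print(f"unreadable: {err}")
```

A book with one smudged letter is still a book; a file with one corrupted byte can be, to the software that's supposed to read it, no file at all.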

The chain of dependencies in our current age of digital 'storage' is now so complex and convoluted that catastrophic failures, whether through mistakes or bad actors, are becoming increasingly common. And the principal, most fundamental dependency is electrical power. Without juice, none of it functions and data simply disappears into the ether; the irony being that data processing consumes energy at an exponential rate. There will come a time when feeding the monster is simply impossible, at which point it, and our data, will die. Time to read a book and value what is truly valuable to us as humans, and to stop trying to vicariously transcend our mortal, corporeal selves: at the end of the day, books will be all that's left us to read anyway; might as well make use of 'em...

