A ‘law’ of the speed of written information loss

As a scientist, I gain confidence in an idea if I can see a definite pattern in the data, and even more so if there is a law that quantifiably describes what is happening, both now and in the future. There are also familiar examples of so-called laws that are actually just reporting trends. In electronics and computing, there is the famous ‘Moore’s law’, which initially noted that the number of transistors on a computer chip was doubling every 24 months (with improvements, this speeded up to every 18 months).

Very similar patterns are cited for the improvements in other related technologies. Examples include the number of pixels per dollar on a CCD camera chip, or the amount of computer memory. All are increasing exponentially with time. For the CCD chip, the improvement rate has been around ten times per year, and for computer memory the value is nearer 100 times per year. In the technology of optical fibre communication, the number of signal channels, data rates, and distance are all important and interlinked. For fibre optics, the product (signal capacity × distance) has expanded by around ten times per four-year period. In this case the progress is not quite a smooth and steady advance, as totally new technologies often had to be introduced to maintain the expansion in signal capacity. It is interesting to see that the graphical plots of transmitted data rates also flow smoothly back in time to the nineteenth century, when Morse code telegraph or heliograph light pulses were used. The only (!) difference is that the transmission capacity has improved by a million million times.
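To compare growth rates quoted over different periods, each can be converted into an equivalent per-year multiplier. The following is a minimal sketch in Python, not part of the original argument. The function name `annual_factor` is an illustrative choice, and the calculation assumes smooth exponential growth, which the text notes is only approximately true for fibre optics.

```python
def annual_factor(factor: float, period_years: float) -> float:
    """Convert 'grows by `factor` every `period_years`' into a per-year multiplier.

    Assumes steady exponential growth over the whole period.
    """
    return factor ** (1.0 / period_years)


# Figures taken from the text above.
moore_24_months = annual_factor(2, 2.0)   # doubling every 24 months, about 1.41 per year
moore_18_months = annual_factor(2, 1.5)   # doubling every 18 months, about 1.59 per year
fibre_optics = annual_factor(10, 4.0)     # tenfold every four years, about 1.78 per year

for label, value in [
    ("Moore's law (24 months)", moore_24_months),
    ("Moore's law (18 months)", moore_18_months),
    ("Fibre capacity x distance", fibre_optics),
]:
    print(f"{label}: x{value:.2f} per year")
```

On this footing, the quoted fibre-optic trend is somewhat faster per year than either version of Moore's law.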

When we consider the case of the survival of information written into different media, there are certainly trends running all the way from carving stone to writing on a computer. Finding an accurate predictive law is highly unlikely, as there are too many differences between the various materials, but a pattern does definitely exist. In my model, the trend I see is the following: the rate at which information is written and stored, multiplied by its survival half-life, is roughly constant. I call this constant S; think of it as a measure of information survival.

Putting in numbers to test this model is certainly going to be contentious and a very personal matter. For instance, I can suggest that with a twentieth-century electric typewriter, a really good typist might have managed 80 pages per day for 250 days per year (i.e. 20,000 pages per year), and this text survived, say, 40 years before the paper and print faded away or were lost in filing systems and office renovations. Ignoring the funny units of ‘pages’ that I am using, this gives an S-value of 20,000 × 40 = 800,000. A comparison can be made with an excellent and prolific scribe who in 1086 produced the final version of the 900-page Domesday Book in about a year. The book has been preserved for some 900 years, which gives him an S-value of 900 × 900 = 810,000. The match may be deliberately biased, but broadly there is agreement in the effective range of S-values that emerge in such estimates.
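The arithmetic behind these two estimates can be written out as a short Python sketch. The function names `survival_measure` and `predicted_half_life` are illustrative, not the author's, and the input figures are the rough guesses quoted above rather than measured data.

```python
def survival_measure(pages_per_year: float, half_life_years: float) -> float:
    """S = (rate of writing) x (survival half-life), in the text's 'page' units."""
    return pages_per_year * half_life_years


def predicted_half_life(s_value: float, pages_per_year: float) -> float:
    """Invert the relation: if S is roughly constant, faster writing implies shorter survival."""
    return s_value / pages_per_year


# Electric typewriter: 80 pages/day for 250 days/year, surviving about 40 years.
typist_s = survival_measure(80 * 250, 40)

# Domesday Book: about 900 pages written in about a year, surviving about 900 years.
scribe_s = survival_measure(900, 900)

print(f"Typist S-value: {typist_s:,}")
print(f"Scribe S-value: {scribe_s:,}")
print(f"Ratio scribe/typist: {scribe_s / typist_s:.4f}")

# Round-trip check: the typist's S-value and rate recover the assumed 40 years.
print(f"Implied typist half-life: {predicted_half_life(typist_s, 80 * 250)} years")
```

The second function is the form needed to turn an S-value and a writing rate into a survival time, which is how the model is used for prediction.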

One can take many other examples, such as the Rosetta Stone. There we can guess at the time taken for the carving and recognize that, although the stone is far from complete because sections are missing, the surviving fragment is still in good condition. The arithmetic offers a comparable value of S. Similar numbers emerge for clay tablets written in cuneiform.

For amusement, I tried using this S-value to predict the survival time of a typical doctoral thesis by a physicist who has done computer calculations and included computer graphics (i.e. where the computer power is equivalent to many years of manual calculation and plotting). The answer was a pitiful few months. However, this is actually sensible! After submitting the thesis, there is a face-to-face examination (the viva voce) within a month or two. The best bits of the thesis are published, but the bound thesis volume then sits in a library, and may never be read again (or perhaps may be eaten by mice).

The modern phase in this trend is where powerful computers and Internet communication are transmitting immense quantities of information on short timescales. The implication is a very short survival lifetime for most of the transmissions. This is certainly true in practice, as most emails will only be read once. Hundreds are scams and junk mail that are immediately binned without being read, and there are millions of blogs and tweets which litter the Internet but which will rapidly fade into obscurity. For such material, Internet information loss is not a totally negative feature. The real downside of the huge traffic flow is that we are simultaneously burying and losing data that is of value, and doing so ever more rapidly than in the past, when we were using other communication methods.

The clear and crucial observation is that the trend is unequivocal: moving to higher-speed writing and calculation techniques generates information at a faster rate, but the material survives for progressively shorter times before being lost or superseded. Further, the same pattern exists across a wide spectrum, from writing, computations, and mechanical equipment to, as I will now show, other types of information.
