Kolmogorov Complexity I

[Revised 12 April 2004]

Kolmogorov Complexity

Two Conventions: from now on, we will assume

We have a shared universal Turing machine that has a binary alphabet (1, 0) and which we use for all inputs (programs plus data). (This gives us a common reference point.)
All inputs and outputs are binary strings. (Other alphabets are convertible to binary strings, so this costs us nothing and makes the math easier.)

The Kolmogorov Complexity of a description or object is the size of the smallest program that can reliably print or recognize that description. We can measure this in bits, if we keep to the conventions above.
Shouldn't this be relative to the programming language (or, the universal Turing machine) I use? Yes, but only to a degree bounded by some finite constant!
We can think of Kolmogorov Complexity as measuring also

Randomness
Compressibility
Information

Kolmogorov and Randomness

Here are three strings. Two are real coin tosses by your teams, and one I made up. Ask yourself two questions: (1) for each string, what are the odds that you tossed that string? (2) can you guess which I made up?
A: 0 1 1 0 0 1 1 1 0 1 1 0 1 0 1 0 1 0 1 0
B: 1 0 1 1 0 0 0 1 1 0 0 0 1 1 0 1 1 1 1 0
C: 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

Standard measures of randomness tell us how likely something is as an instance in a (hopefully very large) population of instances.
But some individual objects (such as the strings above) appear to be, all by themselves, more or less random. And yet, each of these strings would appear to have the same a 1 in 2²⁰ chance of occurring in a truly random coin toss of 20 cases -- so how can that be?
Kolmogorov complexity allows us to describe the fact that some strings are more regular in some sense than are others. This can articulate the idea of a single random instance: strings that have more information are closer to one kind of intuitive notion of "random." A string with maximal Kolmogorov complexity we will call "Chaitin-random." Chaitin Random strings are incompressible: you cannot make them smaller without throwing away information.
Why call this property "random"? Here's one way to understand how this may be akin to the usual notion of randomness. We look at the probability of generating a program that could print the string in question.

Assume we have 2 strings, A and B, each n bits long. String A is Chaitin random, and so has Kolmogorov Complexity n. String B has Kolmogorov complexity k, where k < n. (Typically, we assume that these strings are very long, in order to gloss over the implementation costs).
Assume we are feeding random programs to our UTM (these programs are generated by coin tosses).
We shall have to flip our coin at least n times to generate a program that can print string A of Kolmogorov complexity n. We have then roughly a 2^-n chance of generating A.
But we may be able to generate a program k bits long that prints B. So, we have roughly a 2^-k chance of generating B.
But then B is 2^n-k times more likely to occur!

Note that there are Chaitin Random strings of every size, for any formal language. For example, there are 2ⁿ binary strings of length n. But there are only 2^n-1 shorter strings! So, there are fewer shorter descriptions, and so some strings (in fact, at least about half of them!) must be random.
Kolmogorov and Compressibility

Kolmogorov Complexity tells us how compressible a string is. The higher the Kolmogorov complexity, the less we can squeeze it.
Note that a Chaitin Random string is incompressible. There is no pattern in the string that can be described more briefly that just copying the string itself. (Chaitin sometimes calls such strings patternless. We might say instead that they have no pattern other than what they are.)
Kolmogorov and Information
A string with high Kolmogorov complexity has more information:

A long string with low Kolmogorov complexity can be described by a smaller string -- therefore it could be replaced with a smaller string and we would still have the same information.
A Chaitin-random string cannot be compressed, and so it has maximal information. If we want the information in the string, we need the whole string.