[Revised 12 April 2004]
Kolmogorov Complexity
- Two Conventions: from now on, we will assume
- We have a shared universal Turing machine that
has a binary alphabet (1, 0) and which we use for all
inputs (programs plus data). (This gives us a common
reference point.)
- All inputs and outputs are binary strings.
(Other alphabets are convertible to binary strings, so
this costs us nothing and makes the math easier.)
- The Kolmogorov Complexity of a description or
object is the size of the smallest program that can reliably print
or recognize
that description. We can measure this in bits, if we keep to
the conventions above.
- Shouldn't this be relative to the programming language
(or, the universal Turing machine) I use? Yes, but only to a
degree bounded by some finite constant!
- We can think of Kolmogorov Complexity as measuring also
- Randomness
- Compressibility
- Information
Kolmogorov and Randomness
- Here are three strings. Two are real coin tosses by your
teams, and one I made up. Ask yourself two questions: (1) for
each string, what are the odds that you tossed that string?
(2) can you guess which I made up?
A: 0 1 1 0 0 1 1 1 0 1 1 0 1 0 1 0 1 0 1 0
B: 1 0 1 1 0 0 0 1 1 0 0 0 1 1 0 1 1 1 1 0
C: 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
- Standard measures of randomness tell us how likely
something is as an instance in a (hopefully very large)
population of instances.
- But some individual objects (such as the strings above)
appear to be, all by themselves, more or less random. And
yet, each of these strings would appear to have the same a 1
in 220 chance of occurring in a truly random coin
toss of 20 cases -- so how can that be?
- Kolmogorov complexity allows us to describe the fact that
some strings are more regular in some sense than are others.
This can articulate the idea of a single random instance:
strings that have more information are closer to one kind
of intuitive notion of "random." A string with maximal
Kolmogorov complexity we will call "Chaitin-random." Chaitin
Random strings are incompressible: you cannot make them smaller
without throwing away information.
- Why call this property "random"? Here's one way to
understand how this may be akin to the usual notion of
randomness. We look at the probability of generating a
program that could print the string in question.
- Assume we have 2 strings, A and B, each n bits
long. String A is Chaitin random, and so has
Kolmogorov Complexity n. String B has Kolmogorov
complexity k, where k < n. (Typically, we assume that
these strings are very long, in order to gloss over
the implementation costs).
- Assume we are feeding random programs to
our UTM (these programs are generated by coin tosses).
- We shall have to flip our coin at least n times
to generate a program that can print string A of
Kolmogorov complexity n. We have then roughly a
2-n chance of generating A.
- But we may be able to generate a program k bits
long that prints B. So, we have roughly a 2-k
chance of generating B.
- But then B is 2n-k times more likely to
occur!
- Note that there are Chaitin Random strings of every size,
for any formal language. For
example, there are 2n binary strings of length n. But
there are only 2n-1 shorter strings! So, there are fewer
shorter descriptions, and so some strings (in fact, at least about
half of them!) must be random.
Kolmogorov and Compressibility
- Kolmogorov Complexity tells us how compressible a string
is. The higher the Kolmogorov complexity, the less we can squeeze
it.
- Note that a Chaitin Random string is incompressible.
There is no pattern in the string that can be described more
briefly that just copying the string itself. (Chaitin
sometimes calls such strings patternless. We might say
instead that they have no pattern other than what they are.)
Kolmogorov and Information
A string with high Kolmogorov complexity has more information:
- A long string with low Kolmogorov complexity can be
described by a smaller string -- therefore it could be
replaced with a smaller string and we would still have the
same information.
- A Chaitin-random string cannot be compressed, and so it has
maximal information. If we want the information in the string,
we need the whole string.