Turing and Turing Machines

Turing Background

Born and lived in England (1912-1954)
Wrote his paper defining Turing Machines in 1936
Worked on Enigma codebreaking project during the war
Persecuted for his homosexuality in the 1950s, he died in 1954 by poisoning in what appears to be either an accident or a suicide.

The Turing Machine

Remember Hilbert's 23 problems raised in 1900. One problem, the 10th, asked if there could be an effective solution to a certain kind of mathematical problem. This requires in fact a clear explanation of what an effective procedure is. Turing's abstract machine provides one candidate.
Informal specification: a Turing Machine is a tape head that moves along a ticker tape, reading and writing as it goes. The ticker tape is linear and divided into squares. There is a finite alphabet of what is written on squares; 1 and 0 are enough. The tape head has a finite number of internal states that tell it what to do. For example, a rule might say, if in state 3 and reading a 1 on the tape, write a 0 over it an go into state 4. At least one state is the start state, and at least one is a halt state (which means, I'm done). We interpret the string on the tape in some way, and the output is what is left on the tape after the machine halts.
Formal specification (a la Sipser 1992: 127-128): a Turing machine is a function from one string (the starting state of the tape) to another (the state of the tape after running). This is easier to describe if we assume the tape has blank symbols which are technically not part of the alphabet. Also, it can be handy to call one state the accept state, and another the reject state; this is unnecessary but makes a lot of the computer theory go smoothly. A turing machine is now an ordered set with seven elements: (S, A, L, d, s₀, s_a, s_r). S, A, and L are finites sets. Thus:

S is the set of machine states.
A is the input alphabet (not containing the special blank symbol)
L is the tape alphabet (generally, this is A with the special blank symbol)
d is a function from S x L into S x L x {left, right} (this is the transition function, and just says, for every state, and every input, there is an output of a state, a letter, and either a move left or right)
s₀ is a member of S and is the start state
s_a is a member of S and is the accept state
s_r is a member of S and is the reject state
It is more common to have not an accept and reject state, but rather a single "halt" state, S_H. Accept and reject can be expressed by what you write on the tape.

The Church-Turing Thesis

Alonzo Church developed a tool for describing logical procedures called the lambda calculus. This is basically a tool for doing recursion.
It has been shown that the lambda calculus is equivalent to the Turing Machine in abilities (although, arguably, far less intuitive!).
An informal version of the Church-Turing thesis is: the notion of effective procedure (or algorithm) is exactly what Turing Machines (or the lambda calculus) can do.
This cannot be proven, since the notion of effective procedure is not well defined. But, if we ever found a procedure which we agreed was effective, but which a computer couldn't do, we could then disprove the Church-Turing thesis. This has never happened.
The Universal Turing Machine

The Church-Turing thesis as formulated above refers to Turing machines. But it turns out that we can create a general Turing machine which alone can do what any other Turing machine can do. Call such a machine a Universal Turing Machine or UTM.
A UTM receives as input on its tape a string which now is not just data, but is a program and data. The program tells the machine what to do with its data. Think of the tape as having a long input string, which is really the program string followed by the data string, with some way to tell where the program ends and the data begins (for simplicity, let's assume we have a special symbol "*" which we use just for this purpose; thus p*d is some program followed by "*" followed by some data). The UTM plus the program together now determine what is to be done with the data.
Note that for a particular UTM, each program and data combination is now a particular string. Suppose my alphabet is just 0 and 1. We can even think of a program or a program plus some data as a particular binary string. In turn, we can think of every binary string as some particular program or some program + data combination. (Most of these programs are junk and will crash!)
We will use the following shorthand: suppose that U is a universal Turing machine, p is a program for it, and d is some data for that program. U(p*d) means that we feed the program and data to the machine.
Making a UTM is beyond our scope here, but it is essential that the idea be clear. This is because a particular UTM will provide a common measure for things we later want to do, and from now on we will think of programs or programs plus some data as strings of 1s and 0s.

The Halting Problem

A Turing Machine halts when it finds an answer (it may also halt because it has been given an undefined input). It is easy to see that we could write a turing machine that never halts (e.g., imagine a 1, 0 alphabet, and one state S0: when reading 1 or reading 0, go right and stay in state S0).
If a Turing machine is working on a problem, suppose it takes a long time. We might wonder, is this thing going to find an answer to the problem I gave it, or is it rather going to continue forever? Note that continuing forever means that this machine cannot solve this problem. So, we might find that every machine we make to solve some particular problem is continuing, continuing, continuing, and we want to know, is this because this problem is unsolvable (for the machines we have made, at least), or is it just taking a long time?
In the best of all worlds, we would have our universal Turing Machine, and then we could write a program that checked any arbitrary program for our UTM and some data set, and told us either "this program and this data will halt" or "this program and this data will not halt." Call such a program a Halt-Check.
A Halt-Check is going to be a UTM plus a program, and the data it takes is going to be the combination of some UTM program plus its data. So, if we use our convention made above; and we assume H is the Halt-Check program, U our UTM, p our program to be checked, and d the data it will consider, a situation where we check p with H on d is written U(H*p*d).
There are two arguments that we can use to show that there can be no Halt-Check program. Both are reductio ad absurdum arguments, where we assumer that there is a Halt-Check program and we derive a contradiction.
1. Diagonal Version. Fix some UTM with a binary alphabet of 1, 0. We claim the following:

(a) We can effectively list all the computer programs (which for us is programs plus data).
(b) We can effectively get the output of each program.
(c) We can interpret the output of some of these programs as producing a real number with its decimal extension.
It is trivial to show (a): I just list every string in my alphabet of 1, 0. That is, I list every binary string in turn in some specific order. For (b), I'll just run the programs. And for (c), whatever interpretation I use of the outputs I'll apply this and then skip over those programs that don't have an interpretable output.

We are interested in the question of whether these computers will print out every particular real number with some decimal extension. Many programs will not compute (write on their tape) a real, but we will just ignore these cases (this is our claims (c)). We list then all the programs and their real number output in a table. P1 means program 1, and to the right is the output. We leave a few blanks to indicate there are machines with no real number output. Here I've just made up numbers, but understand again that these are just arbitrary stand-ins for some unspecified but assumed output.
P1 --> 2.67282....
P8 --> 6.42739....
P95 --> 7.23494....
P121 --> 0.00001....
....

You can already see how this is going to go! We can now produce the real number, call it D, that differs on the diagonal of the decimal: 0.7351.... This number is not computed by any machine in the list, but:

(d) D is not on the list.
(e) it is simple to see how to program a machine that creates this number D! There just needs to be a program that outputs each string in turn, feeds it to U as a program, and then takes the output (if there is one) and if this is the nth program that outputs a real it adds 1 to the nth decimal.

But by these two observations we have that D is and is not on the list! So, what has gone wrong? Let's review the claims.

(a) is obviously true: I can enumerate all programs just by listing all strings in my alphabet (0, 1, 00, 01, 10, 11, 000....). (c) looks simple: I just list every output and follow my interpretation rule; the rule allows that some outputs are not a real number (such as no output) and I skip over these. (d) is shown by the argument above. (e) looks certainly true also -- I described quite clearly how to make D.

What about (b)? We thought that was simple: I just run each program, and surely each program is an effective procedure so when I get the output there is an effective procedure to get that output. But note now that if I run these programs, I have assumed that each one will stop! If it's running for 2 years, do I keep waiting? What if it is caught in an infinite loop? Well, it would seem that what I need is an effective procedure to check my programs before I run them to see if they'll give me an output -- that is just what the Halt-Check program is supposed to be. But if I have such a program, I have (b), and I can deduce my contradiction. So it must be that I can't have such a program, and (b) is therefore false.

Hence, there is no effective procedure to tell me which programs will give us an output. There is no effective procedure, therefore, to tell us what all the effective procedures are.

2. Impossible Machine Version. Assumption for reductio: assume that there is a program for our UTM that can determine for any arbitrary program+data combination whether it will halt.

Call this program H for Halt-Check. H takes as its input a string p*d which is just some program followed by its data. We can write U(H*p*d) to say the input to U is H and the concatenation of p and d with the conventional marker for their separation.
Now, from this machine it is trivial to create the following machine C (for Crazy-Machine): C consists of this machine U and H plus just a little bit of extra code: if H says that p*d will halt, loop forever; if H says that p*d will never halt, then halt.
Make just one more modification to create the machine R (for Really Crazy): R takes some input I and just concatenates it (by whatever method we are using to put program and data together) and feeds it to C. So, if you put input I into R it feeds (I*I) to C and we have C(I*I).
Now, feed R to R. What happens? If R(R) halts, then R(R) loops forever; but then R(R) loops, which means that R(R) halts.
The machine R is impossible. Given H, the creation of C and of R was trivial, so something else must have gone wrong: namely, it must be the assumption that there is a program H.

This result is shocking and very powerful. It means that we cannot be sure for any arbitrary well-defined question and procedure to answer it whether that procedure will ever give us an answer. There is no way now to realize Hilbert's dream: we have agreed on what an -- or, at least, what one kind of -- effective procedure is, but we cannot be sure it will ever yield an answer in any arbitrary case. Since there is no effective way to determine what the effective procedures are (the good computer programs), it appears reason cannot discover the limits of reason. (A philosophical consequence of interest is also that this is a nicely clear example of how a fully deterministic system can be in principle unpredictable!)

A Nice Clarifying Case of Deterministic but Unpredictable: Conway's Life

Background: Conway's Life.
A simple "world" that is a grid where each square has a basic primitive state: on or off.

There are the following simple rules:

For "on" squares: if it has 2 or 3 neighbors it stays on, else it goes off.
For "off" squares: if it has 3 neighbors, it turns on; else it stays off.
Here is a nice java app

This simple world has some extraordinary features.

Question: could the real world be like the Life World? That is, could it be following simple computational rules that when seen from a "vast height," appear extremely complex?
Other world rules have been tested. It appears that most devolve into emptiness or a crowded world. Only some develop "interesting" structures.
The Life World has been shown to be able to sustain a Turing Machine. Two people claiming to have built them include http://rendell-attic.org/gol/tm.htm and http://www.igblan.free-online.co.uk/igblan/ca/. Here is a video of a UTM in the life world.
If the Life World can have Turing equivalent structures, then these have all the limitations that exist for computation! (E.g., on predictability.)
A Philosophical Aside: are gliders objects?

Some references:

Hodges, Andrew (1992) Alan Turing: The Enigma. [Nice biography of the extraordinary man. Highly recommended. Lots of great stuff is also on Hodge's Turing www site.]

Sipers, Michael (1992) Introduction to the Theory of Computation. [A very fine introduction to machine theory, although the first edition to which I refer to above is drowning is typos and slighly more serious errors. Sipser caught these and posted them to his www site, but if a later edition comes available that will be a great buy.]