Turing Machines
Machines
A Turing Machine is identical to an LBA, with one exception: the tape is infinite. In its standard form, the Turing Machine's tape has a left endpoint but continues indefinately to the right. Other infinite models for the tape, such as a tape infinite in both directions, or multiple infinite tapes, are equivalent to the standard form.
The Turing Machine is the most powerful machine in the hierarchy, with the power to emulate any of the other machines. It is equivalent in power to most programming languages, although computers only contain finite memory, while a true Turing Machine has infinite memory.
Universal Turing Machine
Turing Machines can also be programmed as something called a Universal Turing Machine, a single TM (meaning a single set of states, rules, and alphabet) that can accept as input another TM encoded as a string, together with an input string. The Universal Turing Machine can then emulate the other TM running the input string. This universal emulation of its own class is a property not shared by any other machine in the hierarchy. Some of them can be programmed to emulate a specific subset of their class, but none of them can emulate any given member of their class.
Looping
However powerful they may be, they do have limitations. One of the most obvious limitations is that unlike LBAs, a TM has an infinite number of conditions, due to the infinite tape. This means that the TM can not only loop, it can be in an infinite running, non-halting pattern that never loops. An example of this is if a TM was programmed to calculate and enter the digits of pi on the tape, and to halt when finished. The machine would continue to run forever, always adding new data to the tape and never looping, but simply running off on an infinite task.
Turing Recognizability
Those languages for which there is a Turing Machine that will always halt and accept in a finite amount of time for any string in the language are called Turing Recognizable languages. Note that it need not be possible to make a machine that halts and rejects all strings not in the language - the machine may loop. Continuing the previous example, the language composed of all {strings giving instructions (with some specific encoding) on how to produce and print as a decimal a number with a finite number of decimal places} is Turing Recognizable. If a machine is given a correct input, it can follow the instructions, print the number, finish, and accept. If it is given instructions that don't make sense, it can reject. But if it is given instructions that look right, and follows them, it will run forever.
Decidability
There are some languages for which a Turing Machine can be written that will halt on all input, either to accept or reject. These languages are called Decidable languages, and TMs that always halt on any input are called Deciders. One example of a decidable language is {stringified LBA with input string that halts on that input}. A TM can be written to emulate an LBA, recoring the number of steps it makes, and accept if it accepts, reject if it rejects, and reject if it exceeds a certain number of steps (what that number of steps is and how it guarantees that the LBA will loop forever are explained in the section on LBAs). All Decidable languages are also by definition Turing Recognizable.
The Halting Problem
The most important limitation of the TM is known as the halting problem. A UTM cannot always accurately predict whether a given TM will halt or run forever on a given input. Clever programming of the UTM can sometimes predict the answer, but the only solution for all problems is to emulate the machine on the input. Unfortunately, if the emulated machine runs forever, so will the UTM, and will never halt and reject. It has been proven that no TM can be written that can decide the halting problem, so the language of {stringified TMs with an input string that halts on that input} is a Turing Recognizable language, but is not Decidable.
Non-Recognizeability
Not all languages can even be recognized. Take, for example, the inverse of the halting problem, the language {stringified TMs with an input string that does not halt on that input}. If that language was Turing Recognizable by some TM 'S', then a UTM 'T' could be designed to emulate S, and alternatingly to emulate the TM being fed to S. Eventually, one of the two will halt, and thus T will be a decider for the halting problem, which we know doesn't exist, so the language is not Turing Recognizable. Furthermore, since every Turing Machine can be encoded as a unique string under some encoding, and there are a countably infinite number of strings, but an uncountably infinite number of languages, there must be an uncountably infinite number of languages to which there is no corresponding TM, and thus are not Turing Recognizable.
Languages
The set of languages recognized by Turing Machines corresponds to the set of languages generated by unrestricted grammars. Unrestricted grammars are grammars composed of a finite number of rules of the form A -> B, where A and B are strings of terminal and non-terminal symbols, and A is not the empty string. As the name implies, this type of grammar has no restrictions other than those provided in Chomsky's definition of a grammar. The languages prodeced by these grammars are called Recursively Enumerable, as a recursive function could theoretically generate all the strings in them, although not necessarily in a finite time.
Examples
Sample TM inputs
A sample TM emulator in Perl. The syntax is: perl progname.pl TMFile inputFile, where TMFile is a text file containing the TM instructions, and inputFile is a text file containing the input string.
use Text::ParseWords;
use strict;
use warnings;
my (@tape, $tapeIndex, %accepts, %alphabet, @rules);
# Grabs the filenames for the machine and the word to be run on it.
my $tmFile = $ARGV[0];
my $input = $ARGV[1];
# We use subroutines to parse and verify the data in the input files.
# The machine data is stored in the $machine structure as the keys rules, accepts, alphabet, and startState.
my $machine = readTM($tmFile);
# Rules and accepts are extracted from the $machine structure for ease of access.
@rules = @{$machine->{rules}};
%accepts = %{$machine->{accepts}};
# This reads the input file and parses it into an array of strings, with each element being one input symbol.
# It checks to make sure the elements are all in the machine's alphabet.
@tape = readInput($input, $machine->{alphabet});
# $changed records whether or not a rule has been used when running through the rules list to make transitions.
my $changed = 1;
# $state is the state the Turing Machine is in, and is initialized to the start state from the machine file.
my $state = $machine->{startState};
# $tapeIndex is the position of the machine's head on the tape.
$tapeIndex = 0;
# Now that the machine is initialized, we can begin making transitions
# As long as things keep changing, keep cycling through the rules.
while($changed)
{
# Unless it changes while going through the rules, the machine will terminate.
$changed = 0;
# The current tape is printed, with the symbol under the head highlighted.
print "@tape[0..$tapeIndex-1]<".$tape[$tapeIndex].">@tape[$tapeIndex+1..@tape-1]\n";
# The current state of the machine is printed.
print "$state\n";
# A new state is calculated by checking conditions against the list of rules
for my $rNum (0..@rules-1)
{
# print "::$rules[$rNum][0]??$branches[$i][0]";
# print "::$rules[$rNum][1]??$string[$branches[$i][1]]";
# print "::$rules[$rNum][2]??".${$branches[$i][2]}[@{$branches[$i][2]}-1]."::\n";
# Checks the current state and tape symbol against the rule being examined
if (($rules[$rNum][0] eq $state) and
($rules[$rNum][1] eq $tape[$tapeIndex]))
{
# The state transition is printed.
# print "State: ".$state." -> ".$rules[$rNum][2]."\n\n";
# Set the new state,
$state = $rules[$rNum][2];
# Write the new symbol to the tape,
$tape[$tapeIndex] = $rules[$rNum][3];
# Shift the tape to the new index,
$tapeIndex += $rules[$rNum][4];
# and make sure it hasn't run past the left edge of the tape.
if ($tapeIndex < 0) { $tapeIndex = 0; }
# If the machine nears the end of the allocated tape, expand the tape.
if ($tapeIndex >= @tape-2) { push(@tape, "_"); }
$changed = 1;
# Once we've made a transition, we can stop and begin looking for the next one.
last;
}
}
}
# When there are no more possible transitions, if the machine is in an accepting state,
if (defined($accepts{$state}))
{
# Print that it accepts and quit.
print "The machine accepts the string.\n";
exit;
}
# Otherwise, print that it does not accept, and quit.
print "The machine does not accept the string.\n";
exit;
###################################################
sub readTM
# This subroutine reads the machine data from the specified file into variables (mostly hashes).
{
my (%states, %accepts, %alphabet, @rules);
open(INFILE, shift) or die "Can't open machine file: $!";
# This block reads the list of states from the machine file.
# Discards the section header,
<INFILE>;
my $line = <INFILE>;
chomp($line);
my @words = &parse_line('\s+', 0, $line);
for (@words)
{
# records the state names for checking the rules,
$states{$_} = 0;
}
# This block reads the start state from the machine file.
# Discards the header,
<INFILE>;
my $startState = <INFILE>;
# takes the whole line as the start state,
chomp($startState);
# and makes sure that the start state is defined in the list of states.
defined($states{$startState}) or die "The start state $startState isn't a state!";
# This block reads the list of accepting states from the machine file.
# Discards the header,
<INFILE>;
$line = <INFILE>;
chomp($line);
# breaks up the line into state names,
@words = &parse_line('\s+', 0, $line);
for (@words)
{
# checks to make sure that the accept states are defined states,
defined($states{$_}) or die "$_ isn't a state!";
# and defines those names in a new hash. The use of a hash makes it easier to determine later if a specific state name accepts or not.
$accepts{$_} = 1;
}
# This block reads the list of symbols in the alphabet from the machine file.
# Discards the header,
<INFILE>;
$line = <INFILE>;
chomp($line);
# breaks up the line into alphabet symbols (note that the symbols can be of arbitrary length),
@words = &parse_line('\s+', 0, $line);
# e is used as the empty symbol in the rules.
$alphabet{e} = 1;
for (@words)
{
# This records which symbols are in the alphabet for checking the rules.
$alphabet{$_} = 0;
}
# This block reads the state transition rules from the machine file.
# Discards the header,
<INFILE>;
# This variable synchronizes the position of each rule in the rules array.
my $rulesCounter=0;
while(<INFILE>)
{
# breaks each rule into start state, input symbol, stack symbol, end state, and new stack symbol.
chomp($_);
@words = &parse_line('\s+', 0, $_);
# checks that the first four pieces are defined in the state and alphabet hashes,
defined($states{$words[0]}) or die "$words[0] isn't a defined state!";
defined($alphabet{$words[1]}) or die "$words[1] isn't defined in the alphabet!";
defined($states{$words[2]}) or die "$words[2] isn't a defined state!";
defined($alphabet{$words[3]}) or die "$words[3] isn't defined in the alphabet!";
# and converts the left/right instruction into a number to be added to the position counter.
if (($words[4] eq "left") or ($words[4] eq "-1"))
{
$words[4]=-1;
}
elsif (($words[4] eq "right") or ($words[4] eq "+1"))
{
$words[4]=1;
}
else
{
(die "$words[4] isn't left, right, -1, or +1!");
}
# then creates an array of each rule.
for (0..4)
{
$rules[$rulesCounter][$_] = $words[$_];
}
# The synchronization variable has to be updated.
$rulesCounter++;
}
# Reading complete, the subroutine closes the file and returns the name of the start state.
close INFILE;
# The relevant data is stored in the $machine structure and returned to the main routine.
my $machine =
{
rules => \@rules,
accepts => \%accepts,
alphabet => \%alphabet,
startState => $startState
};
return $machine;
}
sub readInput
# This subroutine reads the starting tape from the specified file into an array of symbols.
{
my (@tape, %alphabet, $alphaRef);
open(INFILE, shift) or die "Can't open ".$input.": $!";
$alphaRef = shift;
%alphabet = %{$alphaRef};
# The first line of the file is read as the initial state of the tape, with symbols delimited by spaces.
my $line = <INFILE>."";
chomp($line);
@tape = &parse_line('\s+', 0, $line);
# This makes sure every symbol in the input string was defined in the machine's alphabet.
for (@tape)
{ defined($alphabet{$_}) or die "$_ in $input isn't in this machine's alphabet!"; }
close INFILE;
return @tape;
}