wikiacademia

site

Deterministic Finite Automata

Machines

A Finite Automaton is a machine comprised of a finite number of states, with transitions between the states based on input characters. Because the machine has a finite number of states, and stores no other data, it only has finite memory. A string is accepted if and only if, at the end of the input string, the state the machine is in is an accept state.
A deterministic finite automaton, or DFA, makes one transition per character in the input, and contains a transition from each state for each character in the alphabet.

DFAs & NFAs

A non-deterministic finite automaton, or NFA, can consume either one or no characters per transition, does not need a transition for each character from each state, and can have more than one possible transition for a specific input and a specific state. These properties allow the NFA to contain computation branches. If any computation branch generated by a string accepts, the machine accepts that string, and only rejects if no branch accepts. Since DFAs are more restrictive than NFAs, every DFA is also an NFA, and any NFA can be converted into a DFA, so the set of languages recognized by all NFAs and DFAs is the same, and the two machines are equivalent.

Languages

The class of regular languages is those languages which can be described by characters from the alphabet with concatenation, parentheses, the &cup (or) operator, and the * operator. An example would be {(b∪(ab)) * } , the language containing any number of as and bs, where every a is followed by a b. The class of regular languages is equivalent to the class of languages recognized by a DFA.

Examples

Sample DFA inputs

A sample DFA emulator in perl. The syntax is: perl progname.pl DFAFile inputFile, where DFAFile is a text file containing the DFA instructions, and inputFile is a text file containing the input string.
use Text::ParseWords; use strict; use warnings; our (%states, %accepts, %alphabet, @rules, @string); # Grabs the filenames for the machine and the word to be run on it. my $dfaFile = $ARGV[0]; my $input = $ARGV[1]; # Uses subroutines to parse and verify the data in the input files. # The data is stored ih the global hashes and arrays, with the exception of the start state, which is provided here. my $state = readDFA($dfaFile); readInput($input); # The newstate is a temporary storage point for when a new state is calculated. my $newstate; # For each symbol in the input word, the next state is calculated. for (0..@string-1) { # The input word is printed, with the next symbol highlighted. print "@string[0..$_-1] <".$string[$_]."> @string[$_+1..@string-1]\n"; # A new state is calculated. $newstate = $rules[$states{$state}][$alphabet{$string[$_]}]; # The state transition is printed. print "State: ".$state." -> ".$newstate."\n\n"; # The state changes to the new state. $state = $newstate; } # When the input is exhausted, the machine is in its final state. print "Final state is ".$state."\n"; # If that state is in the accept states list, the machine accepts the input. if (defined($accepts{$state})) { print "The machine accepts the string.\n"; } # If not, the machine rejects. else { print "The machine does not accept the string.\n" } ################################################### sub readDFA # This subroutine reads the machine data from the specified file into variables (mostly hashes). { open(INFILE, shift) or die "Can't open ".$dfaFile.": $!"; # This block reads the list of states from the machine file. # Discards the section header, <INFILE>; my $line = <INFILE>; chomp($line); my @words = &parse_line('\s+', 0, $line); my $counter = 0; for (@words) { # assigns each state name a number by creating a hash, $states{$_} = $counter; $counter++; } # and records the number of states. my $stateCount = $counter; # This block reads the start state from the machine file. # Discards the header, <INFILE>; my $startState = <INFILE>; # takes the whole line as the start state, chomp($startState); # and makes sure that the start state is defined in the list of states. defined($states{$startState}) or die "The start state $startState isn't a state!"; # This block reads the list of accepting states from the machine file. # Discards the header, <INFILE>; $line = <INFILE>; chomp($line); # breaks up the line into state names, @words = &parse_line('\s+', 0, $line); for (@words) { # checks to make sure that the accept states are defined states, defined($states{$_}) or die "The start state $_ isn't a state!"; # and defines those names in a new hash. The use of a hash makes it easier to determine later if a specific state name accepts or not. $accepts{$_} = 1; } # This block reads the list of symbols in the alphabet from the machine file. # Discards the header, <INFILE>; $line = <INFILE>; chomp($line); # breaks up the line into alphabet symbols (note that the symbols can be of arbitrary length), @words = &parse_line('\s+', 0, $line); $counter = 0; for (@words) { # assigns each symbol a number, and creates a hash to reference them, $alphabet{$_} = $counter; $counter++; } # and then records the number of symbols in the alphabet. my $alphabetCount = $counter; # This block reads the state transition rules from the machine file. # Discards the header, <INFILE>; while(<INFILE>) { # breaks each rule into start state, input symbol, and end state, @words = &parse_line('\s+', 0, $_); # checks that all three pieces are defined in the state and alphabet hashes, defined($alphabet{$words[1]}) or die "$words[1] isn't defined in the alphabet!"; defined($states{$words[0]}) or die "$words[0] isn't a defined state!"; defined($states{$words[2]}) or die "$words[2] isn't a defined state!"; # then uses the numbers assigned to the start state and the input symbol as indexes in a 2-dimensional array whose value is the name of the end state. $rules[$states{$words[0]}][$alphabet{$words[1]}] = $words[2]; } # This last block makes sure the set of rules is comprehensive. for my $i (0..$stateCount-1) { defined($rules[$i]) or die "Rules are incomplete."; for my $j (0..$alphabetCount-1) { defined($rules[$i][$j]) or die "Rules are incomplete."; } } # Reading complete, the subroutine closes the file and returns the name of the start state. close INFILE; return $startState; } sub readInput # This subroutine reads the input string from the specified file into an array of symbols. { open(INFILE, shift) or die "Can't open ".$input.": $!"; # The first line of the file is read as the input, with symbols delimited by spaces. my $line = <INFILE>.""; chomp($line); @string = &parse_line('\s+', 0, $line); # This makes sure every symbol in the input string was defined in the machine's alphabet. for (@string) { defined($alphabet{$_}) or die "$_ in $input isn't in this machine's alphabet!"; } close INFILE; }