Download the program and source code here

A Turing Machine in C++


I think about Turing Machines a lot. Their implications on large numbers, computability, and prove-ability are fascinating, and speak to the very foundations of mathematics. And more practically, they are a statement on how computation is possible through ultra simple machinery and on the power in the separation of hardware and software.

So to learn more about them, and to keep my coding skills up, and for plain ol’ fun, I decided to write a Turing Machine program. Like Turing Machines themselves, my program is pretty simple - it just reads a “.tm” script, compiles, and runs it.

Here is an example .tm script. As shown, it both defines the program and provides the initial tape input, using the “input” keyword.

	input   000011010101

	State0    1    =      >     =         
	  "       0    1      >     =         
	  "      \0    =      =     Halt


This code takes a string of 1s and 0s, and replaces all the 0s with 1s. So the specified input, 000011010101, would be overwritten as 111111111111. It’s a simple script, though without knowing the syntax, it probably just looks like gibberish. I’ll see if I can explain it a bit, but if you want an in-depth description, I’ve also included the full documentation at the bottom.

It’s worth noting first that the above script can be re-formated like so:


	input   000011010101

	|----+-----------+---------+-------+------+-----------|
	| // | CurrState | Trigger | Write | Move | NextState |
	|----+-----------+---------+-------+------+-----------|
	|    | State0    |       1 | =     | >    | =         |
	|    | "         |       0 | 1     | >    | =         |
	|    | "         |      \0 | =     | =    | Halt      |
	|----+-----------+---------+-------+------+-----------|

This is the same program as above, but with table formatting (via Emacs and org-mode) for easier readability. My Turing Machine program knows to ignore the |, -, + symbols, and to treat everything after any // as a comment, just like in C++.

Ok, with that, let me describe the syntax. For brevity, I’m assuming you already know how a Turing Machine works in concept (and if you don’t, there are plenty of great resources on youtube and elsewhere. One of my favorites is this video by Art of the Problem, How Turing Machines Work).

  • CurrState – is the current state of the turing machine. Specifying " (double quote) for a state name means, “use the previously listed state name here”, so in the example above, every row is describing “State0”. Also, the first state listed is automatically considered the Turing Machine’s initial state - ie, the program entry point.

  • Trigger – is the symbol to try to detect under the Turing Machine’s read/write head, and for which, if found, the associated Write, Move, and NextState instructions are executed. For Turing Machines with large alphabets, it’s useful too to have a default trigger, which is akin to the “default” keyword in a C++ switch block.

  • Write – is the symbol to write to the tape if the Trigger symbol was read (and the Turing Machine is in the matching state). If = (equal) is specified, then don’t write anything at all.

  • Move – is the direction to move the read/write head, if the CurrState and Trigger conditions are met. < denotes “move left”, > denotes “move right, and = means “don’t move”.

  • NextState – is the state the Turing Machine should take if the CurrState and Trigger conditions are met. If = is specified, there is no state change. If HALT is specified, or if the specified NextState is not found, the program ends.




A More Complex Script

Another reason I made the Turing Machine program was to explore a thought experiment - what would it be like to implement a more modern software stack on top of a processing core for which the Turing Machine was the fundamental method of computation? Sorta like with modern computers, where we build abstraction layers on top of abstraction layers, from logic gates, to machine code, to micro code, to assembly language, to programming languages, and beyond. What if I did this same thing, but started with the Turing Machine? Could I get to the point of having some human readable language that gets compiled to turing state and transition code, using a fully self-hosted (ie, running on a Turing Machine) tool chain? What would it be like? Could it be used to solve real world problems?

Interesting stuff and fun to think about, but unfortunately, I didn’t get too far with it. I’m too easily distracted with shiny objects to spend much time on any one thing. Still, before I moved on, I did create a few non-trivial scripts, to test the machine, and also just to get a feel for Turing Machine programming in general.

Here’s an example of one such script. This program takes in two binary numbers and adds them together. An essential component in any computation stack. Had I gotten farther in my aforementioned thought experiment, this might have been my adder. But apart from that, I think it stands on its own as an interesting example of what a more complex Turing Machine script looks like.

//---------------------------------------
//
// Add Binary Numbers
// This is a script to add binary numbers.
// 
// Input:
// 		'a', 'b', and 'c' are place markers.  All are required.
// 		This adds the binary number after 'a' to the binary number after 'b'
// 		and places the result after c
//		Both a and b binary numbers should be equal length.  
//		Behavior is undefined if they differ.
// 		The binary numbers in all cases are represented in reverse reading
// 		order so that the left side is the least significant bit (LSB).
//
// Example:
//
// 		this input
//			input   a00010110  b10101110  c
//
//		gives this output
//			a00010110  b10101110  c10111011
//
//---------------------------------------




input   a00010110  b10101110  c




// Move left until the 'a' marker is found.
// Once found, move right until a 0/1 is found
// Then set the 0/1 to o/i
// Then proceed to appending o/i
// If b is reached, then we are done.  Clean up.
|----+-----------+---------+-------+------+------------------|
| // | CurrState | Trigger | Write | Move | NextState        |
|----+-----------+---------+-------+------+------------------|
|    | Find_a    | ***     | =     | <    | =                |
|    | "         | a       | =     | =    | Get_LSB_a        |
|    | Get_LSB_a | ***     | =     | >    | =                |
|    | "         | 0       | o     | =    | Append_o_after_c |
|    | "         | 1       | i     | =    | Append_i_after_c |
|    | "         | b       | =     | =    | Cleanup          |
|----+-----------+---------+-------+------+------------------|


// Move right until c marker is found.
// The proceed to appropriate Append state
|----+------------------+---------+-------+------+-----------|
| // | CurrState        | Trigger | Write | Move | NextState |
|----+------------------+---------+-------+------+-----------|
|    | Append_o_after_c | ***     | =     | >    | =         |
|    | "                | c       | =     | =    | Append_o  |
|    | Append_i_after_c | ***     | =     | >    | =         |
|    | "                | c       | =     | =    | Append_i  |
|----+------------------+---------+-------+------+-----------|


// Move right until \0 or i is found
// if \0 is found, replace it with o
// if i is found, leave it
|----+-----------+---------+-------+------+-----------|
| // | CurrState | Trigger | Write | Move | NextState |
|----+-----------+---------+-------+------+-----------|
|    | Append_o  | ***     | =     | >    | =         |
|    | "         | \0      | o     | =    | Find_b    |
|    | "         | i       | i     | =    | Find_b    |
|----+-----------+---------+-------+------+-----------|


// Move right until \0 or i is found
// if \0 is found, replace it with i
// if i is found, replace it with o, and carry an i 
|----+-----------+---------+-------+------+----------------|
| // | CurrState | Trigger | Write | Move | NextState      |
|----+-----------+---------+-------+------+----------------|
|    | Append_i  | ***     | =     | >    | =              |
|    | "         | \0      | i     | =    | Find_b         |
|    | "         | i       | o     | =    | Carry_i_from_a |
|----+-----------+---------+-------+------+----------------|


// Move right until \0 is found
// then append an i
// and proceed to find b
|----+----------------+---------+-------+------+-----------|
| // | CurrState      | Trigger | Write | Move | NextState |
|----+----------------+---------+-------+------+-----------|
|    | Carry_i_from_a | ***     | =     | >    | =         |
|    | "              | \0      | i     | =    | Find_b    |
|----+----------------+---------+-------+------+-----------|


// Move left until the 'b' marker is found.
// Once found, move right until a 0/1 is found
// Then set the 0/1 to o/i
// Then proceed to appending o/i
// If c is reached, then we are done.  Clean up.
|----+-----------+---------+-------+------+--------------|
| // | CurrState | Trigger | Write | Move | NextState    |
|----+-----------+---------+-------+------+--------------|
|    | Find_b    | ***     | =     | <    | =            |
|    | "         | b       | =     | =    | Get_LSB_b    |
|    | Get_LSB_b | ***     | =     | >    | =            |
|    | "         | 0       | o     | =    | Add_0_To_End |
|    | "         | 1       | i     | =    | Add_1_To_End |
|    | "         | c       | =     | =    | Cleanup      |
|----+-----------+---------+-------+------+--------------|


// Move right until 'c' marker is found.
// Then start serching for o/i, moving right until found.
// Once found, replace o/i with 0/1
// Then start the whole process over again, going back to Find_a
|----+--------------+---------+-------+------+-----------|
| // | CurrState    | Trigger | Write | Move | NextState |
|----+--------------+---------+-------+------+-----------|
|    | Add_0_To_End | ***     | =     | >    | =         |
|    | "            | c       | =     | =    | Add_0     |
|    | Add_0        | ***     | =     | >    | =         |
|    | "            | o       | 0     | =    | Find_a    |
|    | "            | i       | 1     | =    | Find_a    |
|----+--------------+---------+-------+------+-----------|


// Move right until 'c' marker is found.
// Then start serching for o/i, moving right until found.
// Once found, replace o/i with 0/1, but keep in mind that
// we are adding 1.  so o becomes 1, and 1 becomes 0, but with a carry.
|----+--------------+---------+-------+------+----------------|
| // | CurrState    | Trigger | Write | Move | NextState      |
|----+--------------+---------+-------+------+----------------|
|    | Add_1_To_End | ***     | =     | >    | =              |
|    | "            | c       | =     | =    | Add_1          |
|    | Add_1        | ***     | =     | >    | =              |
|    | "            | o       | 1     | =    | Find_a         |
|    | "            | i       | 0     | =    | Carry_i_from_b |
|----+--------------+---------+-------+------+----------------|


// Move right until \0 is found
// Then append an i
// Then start the whole process over again, going back to Find_a
|----+----------------+---------+-------+------+-----------|
| // | CurrState      | Trigger | Write | Move | NextState |
|----+----------------+---------+-------+------+-----------|
|    | Carry_i_from_b | ***     | =     | >    | =         |
|    | "              | \0      | i     | =    | Find_a    |
|----+----------------+---------+-------+------+-----------|


// Go the 'a' marker
// Then go right, replacing all o/i with 0/1
// HALT when \0 is reached.
|----+-----------------+---------+-------+------+-----------------|
| // | CurrState       | Trigger | Write | Move | NextState       |
|----+-----------------+---------+-------+------+-----------------|
|    | Cleanup         | ***     | =     | <    | =               |
|    | "               | a       | =     | =    | Replace_o_and_i |
|    | Replace_o_and_i | ***     | =     | >    | =               |
|    | "               | o       | 0     | >    | =               |
|    | "               | i       | 1     | >    | =               |
|    | "               | \0      | =     | =    | HALT            |
|----+-----------------+---------+-------+------+-----------------|




Appendix: .tm Script Syntax

The following documentation describes the syntax for defining code and input for the Turing Machine program. This same documenation is included with the source code (see TmMain.cpp), and is recreated here for convenience.

A .tm file consists of two parts.  
   1. The initial symbols on the tape, which are used as the program's 
      input data.  
   2. The turing machine program code, which defines the states, 
      triggers, and tape head instructions


---- The input data ----

Use the keyword "input" followed by a string to set the initial tape
data.  Some examples:

	   input 101101
	   input This is my input string



---- The program code ----

A Turing Machine is a Finite State Machine (FSM).  Thus it is a
collection of States.  Each State specifies a list of Trigger Symbols,
and for each Trigger Symbol there is an associated set of Instructions
to perform, which are to (optionally) write a Symbol, (optionally)
move the tape head, and (optionally) change to a new State.

The program code syntax captures the above setup in rows, with each
row mapping a State and Trigger Symbol pair to a set of Instructions.

A single row is specified using these value types in this order:

	[CurrState]  [Trigger]  [Write]  [Move]  [NextState]

with the types defined as follows:

	CurrState  -  The name of the state the Turing Machine is currently in.
				  The name can be any single word, with no spaces.  It is 
                  NOT case sensitive.
				  If '"' (double quote) is specified as the name, then the 
                  most recently parsed state name is assumed.
				  The first state listed in the program code is considered 
                  the starting state.

	Trigger    -  A symbol.  Case sensitive.  If this matches the value 
                  currently under the "tape head" then execute the associated 
                  Write, Move, and NextState Instructions.
				  The symbol must be a single character, except for:
					  the string "\0", which indicates integer 0 (instead of 
                          the ascii character 0).
					  the string "default", which means "use this trigger, 
                          if no other matching trigger is found".  It is a 
                          "catch all" that works like the "default" keyword in 
                          a C++ switch statement.
					  the string "***" which does the same thing as "default"

	Write      -  A symbol.  Case sensitive.  This gets written to the Tape 
                  when CurrState and Trigger conditions are both met.
				  Must be a single character.  
				  If '=' is specified, nothing is written.

	Move       -  The direction to move the Tape Head when CurrState and
                  Trigger conditions are both met.
					  <, L, l  each mean, "move left"
					  >, R, r  each mean, "move right"
					  =, N, n  each mean, "don't move"

	NextState  -  The name of the State to transition into when CurrState and 
                  Trigger conditions are both met.
				  The name can be any single word, with no spaces.  It is NOT 
                  case sensitive.
				  If the specified state does not exist, the program will give 
                  a warning and automatically halt.  
				  If '=' is specified, the next state is the current state (ie, 
                  no state change occurs).
				  If 'HALT', is specified, the program will halt.

With all this then, a valid 1-State Turing Machine program might look like:

		  State0    1    =      >     =         
			"       0    1      >     =         
			"      \0    =      =     Halt

Additionally, the syntax allows for some special characters to help make 
.tm files and their code within made more readable.  The special characters 
are:

	 "//"      -  this specifies a comment.  Like C++, once this string is 
                  encountered, any following text on the same line is ignored.
	 +, -, |   -  these characters are always ignored.  This makes it possible 
                  to use Emacs to draw tables around the code.

Thus, the example program above can be equivalently specifed as:

		 |----+-----------+---------+-------+------+-----------|
		 |    | CurrState | Trigger | Write | Move | NextState |
		 |----+-----------+---------+-------+------+-----------|
		 |    | State0    |       1 | =     | >    | =         |
		 |    | "         |       0 | 1     | >    | =         |
		 |    | "         |      \0 | =     | =    | Halt      |
		 |----+-----------+---------+-------+------+-----------|

Since all the Emacs table stuff is invisible to the compiler, rows of
the program can be organized into separate tables as needed.  That is,
a single .tm program can be visually presented as many different
tables, (which is useful for oranization and readability), but
internally, all tables in a single .tm are part of a single program.