Turing Machine in SQL (2/5)

This is the second of five posts on this subject.

In a previous post I presented how to implement a Turing Machine (TM) with an ARRAY to store the tape contents. This solution is not really relational, so in this post I’ll show how to build a TM with SQL-only functions.

Using SQL Functions

In this post the TM is built from the following SQL features: WITH RECURSIVE to iterate till the machine stops, INNER JOIN to get transition and state informations, a TABLE to store the evolving tape contents, SQL functions with a LATERAL JOIN to INSERT into and UPDATE and SELECT from the tape.

A SEQUENCE and an ORDER clause is used in the code, but they are not strictly necessary. Also, an ARRAY is used to record the tape state as the machine is executed, but is not strictly necessary either, it is there for displaying the TM execution summary at the end.

TM Tape

The TM tape will be stored in a TABLE:

CREATE TABLE Turing.RunningTape(
  tid SERIAL PRIMARY KEY,
  symbol INTEGER NOT NULL REFERENCES Turing.Symbol
);

INSERT INTO Turing.RunningTape(symbol)
  SELECT symbol FROM Turing.Tape ORDER BY tid;

This running tape will be updated and accessed through two SQL functions:

-- update tape as a SQL function side effect
CREATE OR REPLACE FUNCTION
  Turing.updRunningTape(pos INTEGER, nsymbol INTEGER) RETURNS INTEGER[]
VOLATILE STRICT AS $$
  -- ensure that the tape is long enough, symbol 0 is blank
  INSERT INTO Turing.RunningTape(symbol) VALUES(0);
  -- update tape contents
  UPDATE Turing.RunningTape SET symbol=nsymbol WHERE tid=pos;
  -- return tape as an array for later display
  SELECT ARRAY(SELECT symbol FROM Turing.RunningTape ORDER BY tid)
$$ LANGUAGE SQL;

-- get table contents from a function...
CREATE OR REPLACE FUNCTION
  Turing.getRunningTape(pos INTEGER) RETURNS Turing.RunningTape
VOLATILE STRICT AS $$
  SELECT * FROM Turing.RunningTape WHERE tid=pos;
$$ LANGUAGE SQL;

In the first function, a blank symbol is inserted on each iteration so as to ensure that we do not try to access the tape out of its defined content. Note that the second function which returns the current symbol is necessary because of the peculiar behavior of WITH RECURSIVE which is discussed later.

TM Execution

Let us now record a run with a recursive query:

-- iteration, state, tape (for record, not used!), position
WITH RECURSIVE running(rid, sid, tape, pos) AS (
  -- set first iteration at state 0, position 1
  SELECT 0, 0, ARRAY(SELECT symbol FROM Turing.RunningTape ORDER BY tid), 1
  UNION
  -- compute next iterations
  SELECT pr.rid+1, tr.new_state,
    -- this function relies on side effects to update the tape
    -- the tape is stored as an array just for showing it later
    Turing.updRunningTape(pr.pos, tr.new_symbol),
    pr.pos + tr.move
  FROM -- previous state, THE one reference to running
       running AS pr
  CROSS JOIN LATERAL -- current symbol from tape
       -- does not work: Turing.RunningTape AS tp ON (tp.tid=pr.pos)
       Turing.getRunningTape(pr.pos) AS tp
  JOIN -- corresponding transition
       Turing.Transition AS tr ON (tr.sid=pr.sid AND tr.symbol=tp.symbol)
  JOIN -- state information, necessary to know whether to stop
       Turing.State AS st ON (st.sid=tr.sid)
  WHERE -- stop on a final state
        NOT st.isFinal
)
-- just stores the computed table
INSERT INTO Turing.Run
  SELECT * FROM running;

Note that on each iteration, the next iterations seems to be recomputed over and over, which would result in bad results as the side effects on RunningTape would be cumulated… However this is not actually the case, as WITH RECURSIVE only iterates on new tuples, those that have just been generated by the previous iteration, and ignores those of other preceding iterations.

A more surprising behavior, ISTM undocumented, is that RunningTape contents is not seen to change from the recursive query as iterations are processed, as if it was in a SERIALIZABLE transaction, but the contents seen through the upd and get functions does change over time. The effect of this behavior is that if the get function is replaced with a straightforward JOIN the values fetched are those of the initial table, that is they are not updated, thus both function are really necessary. From a transaction perspective, it seems that the functions are executed in a different realm outside of the WITH query. It really looks like a bug to me, but I was told that it is a feature. Hmmm.

You can try this self-contained SQL script which implements a Turing Machine for accepting the \(A^nB^nC^n\) language using the above method. Postgres 9.3 is required because of the LATERAL JOIN.

In the next post, I show how to put the tape directly in the recursive table and get rid of functions, so that the TM is really relational and does not use SQL functions with side effects.