Turing Machine in SQL (2/5)
This is the second of five posts on this subject.
In a previous post I presented how to implement a Turing Machine (TM) with an ARRAY to store the tape contents. This solution is not really relational, so in this post I’ll show how to build a TM with SQL-only functions.
Using SQL Functions
In this post the TM is built from the following SQL features: WITH RECURSIVE
to iterate till the machine stops, INNER JOIN
to get transition and state
informations, a TABLE
to store the evolving tape contents, SQL functions with
a LATERAL JOIN
to INSERT
into and UPDATE
and SELECT
from the tape.
A SEQUENCE
and an ORDER
clause is used in the code, but they are not
strictly necessary.
Also, an ARRAY
is used to record the tape state as the machine is executed,
but is not strictly necessary either, it is there for displaying the TM
execution summary at the end.
TM Tape
The TM tape will be stored in a TABLE
:
CREATE TABLE Turing.RunningTape(
tid SERIAL PRIMARY KEY,
symbol INTEGER NOT NULL REFERENCES Turing.Symbol
);
INSERT INTO Turing.RunningTape(symbol)
SELECT symbol FROM Turing.Tape ORDER BY tid;
This running tape will be updated and accessed through two SQL functions:
-- update tape as a SQL function side effect
CREATE OR REPLACE FUNCTION
Turing.updRunningTape(pos INTEGER, nsymbol INTEGER) RETURNS INTEGER[]
VOLATILE STRICT AS $$
-- ensure that the tape is long enough, symbol 0 is blank
INSERT INTO Turing.RunningTape(symbol) VALUES(0);
-- update tape contents
UPDATE Turing.RunningTape SET symbol=nsymbol WHERE tid=pos;
-- return tape as an array for later display
SELECT ARRAY(SELECT symbol FROM Turing.RunningTape ORDER BY tid)
$$ LANGUAGE SQL;
-- get table contents from a function...
CREATE OR REPLACE FUNCTION
Turing.getRunningTape(pos INTEGER) RETURNS Turing.RunningTape
VOLATILE STRICT AS $$
SELECT * FROM Turing.RunningTape WHERE tid=pos;
$$ LANGUAGE SQL;
In the first function, a blank symbol is inserted on each iteration so as to ensure that we do not try to access the tape out of its defined content. Note that the second function which returns the current symbol is necessary because of the peculiar behavior of WITH RECURSIVE which is discussed later.
TM Execution
Let us now record a run with a recursive query:
-- iteration, state, tape (for record, not used!), position
WITH RECURSIVE running(rid, sid, tape, pos) AS (
-- set first iteration at state 0, position 1
SELECT 0, 0, ARRAY(SELECT symbol FROM Turing.RunningTape ORDER BY tid), 1
UNION
-- compute next iterations
SELECT pr.rid+1, tr.new_state,
-- this function relies on side effects to update the tape
-- the tape is stored as an array just for showing it later
Turing.updRunningTape(pr.pos, tr.new_symbol),
pr.pos + tr.move
FROM -- previous state, THE one reference to running
running AS pr
CROSS JOIN LATERAL -- current symbol from tape
-- does not work: Turing.RunningTape AS tp ON (tp.tid=pr.pos)
Turing.getRunningTape(pr.pos) AS tp
JOIN -- corresponding transition
Turing.Transition AS tr ON (tr.sid=pr.sid AND tr.symbol=tp.symbol)
JOIN -- state information, necessary to know whether to stop
Turing.State AS st ON (st.sid=tr.sid)
WHERE -- stop on a final state
NOT st.isFinal
)
-- just stores the computed table
INSERT INTO Turing.Run
SELECT * FROM running;
Note that on each iteration, the next iterations seems to be recomputed over
and over, which would result in bad results as the side effects on RunningTape
would be cumulated… However this is not actually the case, as WITH RECURSIVE
only iterates on new tuples, those that have just been generated by the previous
iteration, and ignores those of other preceding iterations.
A more surprising behavior, ISTM undocumented, is that RunningTape
contents
is not seen to change from the recursive query as iterations are processed, as
if it was in a SERIALIZABLE
transaction, but the contents seen through the upd
and get functions does change over time.
The effect of this behavior is that if the get function is replaced with a
straightforward JOIN
the values fetched are those of the initial table, that
is they are not updated, thus both function are really necessary.
From a transaction perspective, it seems that the functions are executed in a
different realm outside of the WITH
query.
It really looks like a bug to me, but I was told that it is a feature. Hmmm.
You can try
this self-contained SQL script which
implements a Turing Machine for accepting the \(A^nB^nC^n\) language using the
above method. Postgres 9.3 is required because of the LATERAL JOIN
.
In the next post, I show how to put the tape directly in the recursive table and get rid of functions, so that the TM is really relational and does not use SQL functions with side effects.