FORMALIZATION OF BACKTRACKING IN FORTH
M.L.Gassananko
St.Petersburg, Russia
mlg@iias.spb.su
This work was partially supported by the grant INTAS 93-845
In our previous work we have presented a formalization of control
transfers due to return stack manipulations. In this paper we show that
this formalism is enough powerful to describe changing to a different
execution scheme--backtracking. We also mention some more possible
execution schemes.
1. INTRODUCTION
The main idea of the techniques, formalization of which we present in
this paper is the following approach: first describe a problem in
concepts natural to it, then implement these concepts, along with an
interpreter for them. Shortly after the advent of first computers, it
was realised that the concepts used directly by the processor, and the
order in which computer must perform actions are inappropriate for
reasoning about problems. Then appeared high-level languages, and
problem-oriented languages; it was realised that procedures must express
concepts of the problem rather than just save memory; finally, such
techniques as backtracking and data-driven approach not only map problem
concepts onto program units, but use a special execution mechanism for
them. (OOP and Forth also may be considered as steps in the same
direction).
A classical model Forth system is well-suited for the studies on the
mechanisms of execution, since its interpreter data structures are open
and may be accessed and changed by programs. This unique feature is very
important; unfortunately, it is not in the ANSI standard, probably,
partially because it cannot be found in other languages.
During the last 5 years theories of stack effects checking and
type inference were developed [Poi94 (review), Poi91, Poi93, StK92,
StK93, Sto93], but these formalisms do not address manipulations with
executable code addresses.
In [Gas95] we have presented a formalism that describes how
manipulations with return stack information reflect in control flow
changes. In [Gas93R,Gas94] we have presented BacFORTH--a Forth extension
that supports backtracking. Backtracking is implemented there via return
stack manipulations, as well as a number of unstandard control
structures. The use of backtracking implies some additional discipline
with respect to the return stack and to the stack of continuations:
success is implemented as a call of continuation (i.e. of the residue of
the caller procedure code), failure is a return from continuation, the
return stack contains backtracking information while the continuation
stack holds continuation addresses that cannot be anymore kept on the
return stack. The discipline implies that data must not interfere with
the backtracking information (it is possible to keep data on the
continuation stack instead of the return stack), that one should take
into account what happens when backtracking (especially with respect to
the stack balance); each backtrackable procedure assumes that when it
receives control, the return stack top contains the address to which
control must backtrack on failure; each backtrackable procedure is
allowed to leave any information on the return stack, but the top value
must point to the code that must receive control when backtracking.
The rest of this section briefly describes the [Gas95] paper.
1.1 THE FORMALISM: MOTIVATION AND GOALS
The fundamental difference between data and return addresses
manipulations is that return addresses manipulations may
result in control transfers.
The return addresses manipulations are proven to be a very
powerful technique and are an important part of Forth since
they enable us to define both classical and unstandard control
structures and to implement new methods of program execution,
such as backtracking, self-processing data, etc.
Unfortunately, return addresses manipulations today are often
considered as a bad style, since they are not present in other
languages, not formalised and not included into standards.
We need a calculus that provided a formal description of
existing (and, probably, yet unknown) techniques that use
return addresses manipulations to organize code execution.
Examples (for a classical system):
: BRANCH R> @ >R ; \ ( IP: dest -- ) ( ip:=dest )
: UNNEST RDROP ; \ ( R: a -- ) ( ip:=a ) a.k.a. EXIT
\ Backtracking:
: ENTER >R ; \ ( addr -- ) call the code at addr
: 1-10 ( -> n n ) ( <- n ) \ look over numbers 1..10
1 BEGIN DUP R@ ENTER 1+ DUP 11 = UNTIL DROP RDROP ;
: TEST \ print the numbers from 1 to 10
1-10 . ;
More problems: threaded code may contain both code and data.
Example:
\ LIT fetches a value from the threaded code
\ and leaves it on the stack
: LIT ( -- x ) ( IP: x -- )
R@ @ R> CELL+ >R ;
Such word is compiled before any integer that appears in
a colon definition. The definition:
: X 5 . ;
compiles to:
[ code field ] [ LIT ] [ 5 ] [ . ] [ EXIT ]
So, we need a criterion to distinguish between code and data.
1.2 THE FORMALISM: BASIC NOTIONS
a) TCE - Threaded Code Element. Either a compiled token of a
word or a data element (those typically follow compiled
words that process them). TCEs are of known (probably,
different) sizes.
TCF - Threaded Code Fragment. A sequence of TCEs, typically
is processed by the threaded code interpreter. We cannot
predict the length of a TCF.
TC - an abbreviation for "threaded code"
b) Position.
The difference between code and data is the way of processing.
Code is processed only by the code interpreter. Data get
processed by other functions and not by the code interpreter.
A TCE may be in one of the following 4 positions:
Active, if it is code: it gets processed only by the code
interpreter.
Passive, if it is data: it gets processed by other
methods.
Mixed, if is both code and data: it gets processed in
both ways (it may be self-modifying code or code
considered as data).
Indifferent, if it is never processed at all (we may call it
garbage).
c) Equivalence of TCFs. (functional)
DUP SWAP DROP == NOOP
OVER SWAP == >R DUP R>
Two code fragments are equivalent if each of them in any context in
active position may be substituted by another.
2 DEFINITION OF HIGER-ORDER CODES
Abbreviations:
TC - threaded code
TCF - threaded code fragment
Ordinary threaded code is called 1-order threaded code.
0) Construction of higher-order threaded code starts from the 1-order
threaded code. There are following notions:
the continuation (for a procedure, when it is being invoked) is the
resudue of TC in the TCF that invokes this procedure (at the moment
of invocation the continuation address is in IP);
success (in general) is a control transfer to the continuaion (this
may be done in different ways: a jump, a call, etc.);
exit (in general) also is a control transfer to the continuaion (exit
and success are different actions; when exit is a jump, then success
is a call)
1-return stack (1-continuation stack) is the return stack;
1-exit (1-return) - is the traditional EXIT, i.e. a control transfer
to the address taken off the return (1-continuation) stack;
1-call is the traditional call of a TCF: the current IP value is
placed onto the 1-return stack, and IP gets set to the start of the
TCF;
1-trace stack does not exist.
(i+1)-order threaded code is built from the i-order threaded code in the
following way:
1) i-success is an i-call of thecontinuation;
2) i-return (i-backtracking) is an i-exit from the continuation;
3) i-failure (of a procedure) is an action that forces i-backtracking
from the TCF that invoked the procedure;
4) one more stack is created: (i+1)-return ((i+1)-continuation) stack;
5) (i+1)-trace stack is the i-continuation stack;
6) i-reversible operation is such operation that performs reversal
actions when i-backtracking;
7) (i+1)-call: place (i-reversibly) the continuation address onto the
i-continuation stack; when i-backtracking, remove it from there and
force i-failure of the procedure;
8) the i-PRO action complements i-call to the (i+1)-call (i.e. moves the
continuation address from the i-continuation stack to the
(i+1)-continuation stack, when backtracking it discards it from there
and forces i-failure)
the i-CONT action is i-success (i-call of continuation) using the
continuation address at the (i+1)-continuation stack;
9) (i+1)-exit is i-success; when control i-returns from the
continuation, i-backtracking is raised (i.e. this is i-CONT i-EXIT);
For example, for 2-order TC:
1) 1-success is a call of continuation
2) 1-return (1-backtracking) is a return from the continuation;
3) 1-failure forces a return from the TCF that invoked the failing
procedure;
4) 2-continuation stack is L-stack (with the access words >L , L> ,
etc.);
5) 2-trace stack is the return stack;
6) a reversible operation performs reversal actions when backtracking;
7) 2-call: place the continuation address onto L-stack; when
backtracking remove the continuation address and raise backtracking
again;
8) the 1-PRO action is implemented as:
: ENTER >R ;
: PRO R> R> >L ENTER L> DROP ;
the 1-CONT action is implemented as:
: CONT L> >R R@ ENTER R> >L ;
9) 2-exit is CONT EXIT .
3. BASICS OF THE FORMALISM
3.1 THE TERMS AND THE NOTATION
Following the Forth tradition, the term "stack" will refer by default to
the data stack, while other stacks will be termed by their full names.
We will use the following latin letters to denote:
t..z --- threaded code fragments (TCF);
f,g,h --- functions (occupying exactly one threaded code element);
M,N --- integer literals;
P,Q --- properties (feaures);
i..n --- natural numbers.
The meanings of separate signs:
[ and ] denote the boundaries of a TCF;
' - address in active position;
^ - address in passive position;
, - a single threaded code element;
r - mention of the return stack;
l - mention of the L-stack (the stack of continuations [Gas94]);
: - having a feature;
! - assertion.
These signs are the elements from which all other symbols are
assembled. In the Forth language all they have a different meaning.
As in Forth, the symbols are separated by spaces.
Equivalence will be denoted by the symbol == and will be understood as
possibility to substitute one threaded code fragment for the other in
active position in any context. The symbol =/= will denote
inequivalence.
The =>= sign will be used in the cases where substitution is possible,
but some inormation gets lost:
A =>= B
means that (probably, in view of some other statements)
X == A => X == B
(the => symbol denotes implication)
The notation
x:P
may read as "x has the property P" or "x that has the property P". When
some properties are consequences of other properties, we will use the
symbol :! to underline that some properties refer to the consequence
rather than to the premise. For example, x:S2:!S1 is a concise notation
for x:S2 => x:S1 (translation: if a TCF ignores (neither affects nor
depends upon) two top data stack elements, it is also a fragmwnt that
ignores one top stack element).
When necessary, we will bracket the code fragment to which the property
applies. For example, [ x:R y:R ]:!R (translation: the code fragment
[ x y ], composed from two code fragments that ignore the return stack,
also ignores the return stack).
Now we give explanation of other symbols:
; - denotes EXIT
[r x '] y - call of the threaded code fragment x with the
'r[ y ] x continuation y; that is, place the address of y onto
the return stack and transfer control to x. The TCF y
is in active position.
[r x ^] y - call of the threaded code fragment x with the
^r[ y ] x continuation y; that is, place the address of y onto
the return stack and transfer control to x. The TCF y
is in inactive position.
'[ x ] - place address of the threaded code fragment x, in
active position, onto the data stack.
^[ x ] - place address of the threaded code fragment x, in
inactive position, onto the data stack.
,'[ x ] - a threaded code element that contains a reference to a
TCF in active position.
,^[ x ] - a threaded code element that contains a reference to a
TCF in inactive position.
,[ x ] - the same as .'[ x ]
^,^[ x ] - the address of a threaded code element that contains a
reference to the TCF x in inactive position.
^,[ x ] - the address of a threaded code element that contains a
^,'[ x ] reference to the TCF x in inactive position.
pre u x - The TCF x is located in memory next to the TCF u.
or
pre[ u ] x
,f - Compilation token of the function f (only one TC
element)
Act: x - The TCF x is in active position.
!Act: x - Assertion that x is in active position
%A - universal quantifier
%E - existential quantifier
We may change from "^" to "'" assuming that address is in active
position. We cannot change from "'" to "^", though, since doing so we
would assume that not only the interpreter, but all the methods of
processing do not distinguish between equivalent (to interpreter)
fragments, which is not true.
The use of functions in code fragments without preceding commas ","
underlines that the actual number of compiled tokens is not important.
Strictly speaking, the Act: property is a property not of a code
fragment but of its context. It is convenient to assign it to an object
in the context, though.
Special means.
We will consider each formula as surrounded by special symbols "<<" and
">>", if they do not appear in it. By default, these symbols may be
substituted by any code fragment. Hence x == y is a concise notation
for %A << %A >> << x >> == << y >> (which conforms with the axiom 7
given below).
Example:
[r x '] == 'r[ >> ] x
means
<< [r x '] >> == << 'r[ >> ] x
For repeating fragments we will use the following notation:
( repeat_specification ){ repeating_fragment }
For example, we will write
(5){ N } instead of N N N N N
(i=1..5){ N[i] } instead of N[1] N[2] N[3] N[4] N[5]
and so on.
Along with this notation in some cases we will use the more customary
one (namely, in cases where it is shorter), for example:
N1 .. Nm is the same as (i=1..m){ Ni } .
Note. The BacFORTH system [Gas94] includes the contol structures
START...EMERGE and BACK...TRACKING:
x START y EMERGE z == x [r y ; '] z
x BACK y TRACKING z == x 'r[ y ; ] z
Comments [not necessarily strict] in English are given in parenteses,
that facilitate understanding but not always may serve instead of
formal text and are not a part of the theory.
3.2 BASIC AXIOMS
(1) '[ t >> ] >R ; == t
(placing a TCF address onto return stack and transferring control
to it via EXIT is equivalent to execution of this TCF)
(2) [r t '] u == 'r[ u ] t
(there are two notations for placing a TCF address onto return
stack)
(3) If x is defined as
: x t ;
then
x == [r t ; ^]
(4) ^r[ w ] RUSH> ,f == 'r[ ,f w ] ;
( RUSH> is equivalent to exit to a threaded code fragment
that starts with and continues like that which address is
on the return stack top; see also the section 2.5)
(5) 'r[ t ] r> == '[ t ]
(the word R> moves a TCF from return stack to data stack)
(6) ( %A N : N x == N y ) ==> x == y
(if for all integer N, N x is equivalent to N y, then x and y are
equivalent)
(7) x == y <=> ( %A u,v : u x v == u y v )
(x is equivalent to y, if and only if they have the same behaviour
in any context (in active position); this is the definition of
equivalence)
(8) ^[ Act: x ] == '[ x ]
(the tick "'" means an active position)
(9) pre[ u ] x =>= x
(if a TCF is in active position, the interpreter does not process
the addresses before it (the interpreter always "goes forwards");
we can substitute pre[ u ] x for x, but we lose some information
about the exact contents of memory)
(10') ^[ ,'[ u ] v ] REF@ == '[ u ]
(10^) ^[ ,^[ u ] v ] REF@ == ^[ u ]
(the word REF@ receives the address of reference to a TCF, and leaves
the address of that TCF)
(11) ^[ ,[ u ] v ] REF+ == ^[ pre,[ u ] v ]
(the word REF@ receives the address of a TCF that starts with a
reference (to another TCF), and leaves the address of code that
follows the reference)
(12) %A t [ ; t ] == ;
(an EXIT makes the interpreter ignore the code that follows the
EXIT)
(13) Application of data processing operations to addresses in
active position is a contradiction.
Examples:
'[ u ] REF@ --- contradiction
'[ u ] REF+ --- contradiction
(14) %A t == u ; : %E v,w : t == u ; == v 'r[ w ] ;
(an EXIT assumes an address in active position on the return stack
top; even if the program has a bug, we may think of the return
stack top as an address of code that hangs the system up; we will
not examine possible equivalence of different hangups).
Note. EXIT assumes a TCF address on the return stack top, its position
is either active or mixed.
3.3 DEFINITIONS I
Here we repeat basic definitions from [Gas95]. It turns out that in
static analysis of return addresses manipulations, one of the most
important properties of a code fragment is to ignore (neither affect nor
depend on) some values on the stack(s).
Def.1. S1( t ) is a TCF (threaded code fragment) such that
%A N : ( N S1( t ) == t N )
( S1( t ) behaves like t , but ignores the value on the data stack
top)
Examples:
S1( DROP ) == NIP
(Indeed, %A N N NIP == DROP N )
S1( SWAP ) == >R SWAP R>
Def.2. S[m]( t ) is a TCF such that
%A N1,...,Nm :
( N1 ... Nm S[m]( t ) == t N1 ... Nm )
We will omit brackets when concrete numbers substitute for m, i.e. we
will write S3 instead of S[3] .
( S[m]( t ) behaves like t but ignores m top stack elements)
Def.3. We will write t:S[m] , if %E y : t == S[m]( y ) .
Note. We will use the signature t:S[m] in the same way as adverbs and
adjectives are used in natural languages, e.g.:
%A N,M : N t:S1 DROP == M t DROP
( t:S[m] -- TCF t that ignores m top stack elements)
Def.4. We will write t:S , if t == S1( t ) .
( t:S ignores stack depth and contents)
Def.5. S[-1]( t ) is a TCF such that
S1( S[-1]( t ) ) == t
(the "inverse operation to S1( )" )
Note. S[-1]( t ) exists not for all t .
Def.6. S[-m]( t ) is a TCF such that
S[m]( S[-m]( t ) ) == t
If t:S[m] , i.e. %E y : t == S[m]( y ) , then y == S[-m]( t ) .
In a similar maner we introduce the notions of R1, R[m], R[-m], t:R[m],
t:R.
Def.7. R1( t ) is a TCF such that
%A N : ( N >R R1( t ) == t N >R )
( R1( t ) behaves like t , but ignores the value on the return stack
top)
Def.8. R[m]( t ) is a TCF such that
%A N[1], ... N[m] :
(i=1..m){ N[i] >R } R[m]( t ) == t (i=1..m){ N[i] >R }
Def.9. We will write t:R[m] , if %E y : t == R[m]( y ) .
Def.10. We will write t:R , if t == R1( t ) .
(t ignores the return stack; this property is very important for us)
Def.11. R[-1]( t ) is a TCF such that
R1( R[-1]( t ) ) == t
(the "reversal operation" to R1( ) )
Def.12. R[-m]( t ) is a TCF such that
R[m]( R[-m]( t ) ) == t
Examples.
S1( DROP ) == >R DROP R> == NIP
S[-1]( NIP ) == DROP
S1( EXIT ) == DROP EXIT
[ SWAP ]:R
[ R> DROP ]:S
[ ABORT ]:R:S
4 FORMALIZATION OF 2-ORDER (BACKTRACKABLE) THREADED CODE
In this section we describe 2-nd order threaded code. We introduce a
new notion of equivalence (b-equivalence) which will be used to
describe the 2-nd order TC (or 2TC). Usage of 2TC implies some
discipline, and b-equivalence will assume the same discipline
(2-order procedure may fail which results in exiting the code
fragment in which it is used, and therefore the return stack top must
contain a TCF address; a 2-nd order procedure makes no assumptions
about what is below the return stack top; but, nevertheless, it may
assume that other procedures also follow this discipline).
4.1 PROPERTIES OF TC FRAGMENTS THAT PERFORM EXIT
Def.1. t:US1 if %A N : N t == N t DROP N
(the TCF t:US1 does not change (although can use) the top stack
value)
Examples:
[ DUP ]:US1
[ SWAP 1+ SWAP ]:US1
Note. Evidently, t:S1:!US1
Def.2. t:US[m] if %A N1 .. Nm : N1 .. Nm t == N t (m){ DROP } N1 .. Nm
(the TCF t:US[m] does not change (although can use) m top stack
values)
Note. Evidently, t:S[m]:!US[m]
Def.3. t:Ign if %A u : t u == t
( t:Ign ignores its continuation, i.e. does not pass control to
the code that follows it)
Note. t:Ign <=> t == t ;
Indeed,
%A u : t:Ign u == t => t ; == t
and vice versa, from %A u : ; u == ;
and t == t ;
we conclude that
%A u : t u == t
Def.4. t:REI if t:Ign and [r t ']:R
It should be noted that the definition of REI assumes that the
address at the return stack top is in active position (is processes
only by the code interpreter).
Examples:
[ DROP EXIT ]:REI
[ IF 1+ EXIT THEN 1- EXIT ]:REI
Def.5.
In most cases below we will assume that the TCF shown in the formula
receives control when the return stack top contains an address of TC
fragment, which we will denote by r>> . The equations that need this
assumption will be marked by the symbol (r). So, the formula
(r) u == v
is an abbreviation for
%A << %A >> %A r>> : << 'r[ r>> ] u >> == << 'r[ r>> ] v >>
Note. From purely formal point of view, the (r) symbol introduces a
restriction which formally enables us to use the r>> symbol. From the
point of view of underlying practice, the (r) symbol is not an
additional restriction, since the code that requires a TCF address on
the return stack will crash if it is not there.
Def.6.
Let us introduce a similar notation for the assumption that >>:REI :
(e) u == v
is an abbreviation for
>>:REI => u >> == v >>
which, in its turn, is an abbreviation for
%A << %A >>:REI : << u >> == << v >>
Statement.1. [r 'r[ ; ] t:REI '] == [r t ']
Proof.
[r 'r[ ; ] t '] == [r [r t ']:R ; '] == [r t ']
Statement.2. (r) [r t:REI '] ; == t
(r) 'r[ ; ] t:REI == t
Proof.
(r) t == 'r[ r>> ] t == [r t '] r>>
== [r [r t '] ; '] r>> == 'r[ r>> ] [r t '] ; ==
== [r t '] ; == 'r[ ; ] t
Property [ t:R ; ]:!REI
(when we add EXIT to the end of a R-type TC fragment, we get a
REI-type TC fragment)
Statement.3. %A u:REI %E v:R : u == v ;
%A v:R %E u:REI : v == [r u ']
Proof.
In view of the previous theorem and the definition of REI the code
fragments defined as:
v == [r u ']
u == v ; == [r u '] ;
meet the statement of the theorem.
So, there is one-to-one mapping between R-type and REI-type TC
fragments.
(A TCF which performs a call of REI fragment is an R-type
fragment; a REI fragment may be formed from an R-type fragment by
addition of an EXIT to its end.)
Def.7. (b-equivalence)
(b) u == v if
%A << %A r>> %A >>:REI
<< 'r[ r>> ] u >> == << 'r[ r>> ] v >>
In other words,
(b) u == v
means that
(r) (e) u == v
Property. u == v => (b) u == v
Property. (b) u:REI == v:REI => [r u '] == [r v ']
One can note that all mathematics in the proofs of S[m] properties
remain valid if we change equivalence for b-equivalence. Therefore we
can e.g. use comparison theorems, although we can state only
b-equivalence in case that the TC fragments are of RE-type.
Def.8. t:RE if %A v:R : [r t v ; ']:R
(if u:R, then call of the TCF [ t v ; ] gives a TCF of type R )
(indicating %A v:R we require that t does not leave anything but a
TCF address on the return stack top)
Note. Restricting ourselves to RE-type TC fragments, we exclude TC
fragments that use data stack to keep information on which the return
stack effect depends from consideration (it is not difficult to
construct a definition that we probably would like to classify as RE,
but which doesn't fit this definition--e.g. a definition which moves
some return stack values to the data stack and succeeds, and when
backtracking restores them and fails: this definition will crash if
v:R modifies the saved values).
Property. If v == NOOP , then it follows from the axiom 14 that
'r[ ; ] t:RE may be represented as u 'r[ w ] .
Statement.4. t:RE <=> %A u:REI : [ t u ]:REI
(Equivalent formulation of RE-type definition)
Proof.
1) Let us show that t:RE => %A u:REI : [ t u ]:REI
Let %A v:R : [r t v ; ']:R
Let u:REI , then v == [r u ']:R and
[r t v:R ; ']:R == [r t [r u '] ; '] == [r t 'r[ ; ] u '] ==
(since 'r[ >> ] t may be represented as u 'r[ w ] )
== [r t u ']
2) Let us show that from %A u:REI : [ t u ]:REI
follows %A v:R : [r t v ; ']:R
At first, [ t u ]:REI => [r t u ']:R
Then, %A v : [ v:R ; ]:!REI =>
%A v:R : [ t [ v ; ]:REI ]:!REI =>
%A v:R : [r t v ; ']:R
Let us pass on to the discussion of the properties of the RE property.
Property. t:REI:!RE
( REI is a particular case of RE)
Property. t:RE:Ign:!REI
(if a TCF t:RE ignores its continuation, then t:REI)
Property. (b) u:RE == 'r[ ; ] u
The proof follows from that [ u >>:REI ]:!REI
Property. (e) u:RE 'r[ ; ] == u
(This statement may be rather assigned to the properties of
>>:REI)
Property. (b) x:RE == y => y:RE
Proof.
Let y:/RE, i.e. %E v:REI : [ y v ]:/REI
But then
(r) y v == x v, which contradicts to that [ x:RE v:REI ]:REI
Property. If (b) u == v and (b) x:RE == y:RE , then:
1) (b) u x == v x
2) (b) u x == u y
3) (b) u x == v y
Proof.
1) Let u x =/= v x , i.e.
%E << %E r>> %E t:REI
(r) [ u x t ] =/= [ v x t ]
But then, substituting >> == x t into the definition of
b-equivalence, we come to a contradiction to that (b) u == v
(b) u == v
2) By analogy, if u x =/= u y , then we come to a contradiction
substituting << == u into the definition of b-equivalence.
3) (b) u x == v x == v y
Property. t:R:!RE
Property. [ t:RE u:REI ]:!REI
Property. [r t:RE u:REI ']:!R
Property. [ u:RE v:RE ]:!RE
In a similar manner we define the properties S1E, S1EI, US1E, US1EI,
S[m]E and so on:
t:XE if %A u:R:X : [r t u ']:X
t:XRE if %A u:R:X : [r t u ']:X:R
t:XEI if t ; == t and [r t ']:X
t:XREI if t ; == t and [r t ']:X:R
where X stands for some property.
4.2 DEFINITIONS II
Def.1. BDROP == >R 'r[ R> ; ]
DROPB == 'r[ DROP ; ]
Def.2. S1B( t:RE ) is such TCF that
%A N : (e) S1B( t ) 'r[ N ; ] == 'r[ N ; ] t
( S1B( t ) bahaves like t, but does not affect the stack top when
bactracking)
Def.3. S[m]B( t:RE ) is such TCF that
%A N1, ..., Nm
(e) S[m]B( t ) 'r[ N1 ... Nm ; ] == 'r[ N1 ... Nm ; ] t
Def.4. S[-m]B( t:RE ) is such TCF that
S[m]B( S[-m]B( t ) ) == t
(a "reversal operation" for S1B( t ) )
Def.5. t:SB if S1B( t ) == t
Def.6. BS1( t:RE ) is such TCF that
(b) N BS1( t ) >>:US1EI == 'r[ N ; ] t N DROPB
( BS1( t ) does not affect the return stack top neither in
"forwards" execution nor when backtracking)
Statement.4. (Equivalent formulation for the definition of BS1( t ) )
t:RE => %A N : %A x:US1:R :
N [r BS1( t:RE ) x:US1:R ; '] == [r t N x DROP ; '] N
Proof.
( => )
Let us show that the statement is true in view of definition of
BS1( t ). Let us consider arbitrary N and x:US1:R .
At first, we rewrite:
'r[ >> ] N BS1( t ) x ; == [r N BS1( t ) x ; '] r>>
Then
'r[ N ; ] t:RE N DROPB x ; ==
== [r [r t N x 'r[ DROP ; ] ; '] N ; '] r>> ==
== [r [r t N x DROP ; '] N ; '] r>> ==
== [r t N x DROP ; '] N r>>
One can mention that in view of definition of BS1( t ) these two
expressions are equivalent.
( <= )
Let us show that the definition of BS1( t ) may be derived from
the statement of the theorem. Let us consider arbitrary N , r>>
and >>:REI:US1EI. Then
N [r BS1( t ) [r >> ']:US1:R ; '] r>> ==
== [r N BS1( t ) >> '] r>> == 'r[ r>> ] N BS1( t ) >>
[r t N [r >> ']:US1:R DROP ; '] N r>> ==
== [r t N 'r[ DROP ] >> '] N r>> ==
== 'r[ N r>> ] t N DROPB ==
== 'r[ r>> ] 'r[ N ; ] t N DROPB
Since we assume that the statement of the theorem is true, these
two lead us to the original definition of BS1( t ).
BS[m]( t ) is defined by analogy:
Def.7. BS[m]( t:RE ) is such TCF that
%A N1 ... Nm : %A x:US[m]:R :
N1 ... Nm [r BS[m]( t ) x:US[m]:R ; '] ==
== [r t N1 ... Nm x (m){ DROP } ; '] N1 ... Nm
Def.8. t:BS if BS1( t ) == t
Def.9. BS[-1]( t:RE ) is such TCF that
BS1( BS[-1]( t ) ) == t
Note. BS[-1]( t ) may exist not for any t .
("reversal operation" for BS1( t ) )
Def.10. BS[-m]( t ) is such TCF that
BS[m]( BS[-m]( t ) ) == t
if t:BS[m] , i.e. %E y : t == BS[m]( y ) , then y == BS[-m]( t ) .
4.3 PROPERTIES OF S[m]B
Note that S1B( t ) does not obligatory exist.
Statement.5. (b) S1B( S1B( t:RE ) ) == S2B( t )
Proof.
%A N, M :
(e) S1B( S1B( t ) ) 'r[ N ; ] 'r[ M ; ] ==
== 'r[ N ; ] S1B( t ) 'r[ M ; ] ==
== 'r[ N ; ] 'r[ M ; ] t
The proofs of the following statements are analogous to those of
the similar statements for S[m].
Statement.6. (b) t == S1B( t ) => %A m : t:S[m]B
Statement.7. %A N : (b) DROPB S1B( t ) 'r[ N ; ] == t
Statement.8. t:S1B <=> %A N : (b) DROPB S1B( t ) 'r[ N ; ] == t
Consequence. %A N : (b) S[-1]B( t ) == DROPB t 'r[ N ; ]
Statement.9. S[-1]B( S1B( t ) ) == t
Statement.10. (Comparison by S[-1]B )
Let t:RE:S1B, u:RE:S1B
S[-1]B( t ) == S[-1]B( u ) <=> t == u
Statement.11. (Comparison by S1B )
Let t:RE, u:RE
S1B( t ) == S1B( u ) <=> t == u
Statement.12. If S1B( t ), S1B( u ) exist, then
S1B( t ) S1B( u ) == S1B( t u )
Consequence. [ t:S1B u:S1B ]:!S1B
Statement.13. u:RE:S1B, v:RE:S1B =>
(b) S[-1]B( u ) S[-1]B( v ) == S[-1]B( u v )
Statement.14. (b) S[-1]B( S[-1]B( t:S2 ) ) == S[-2]B( t )
Def.1. S[0]B( t ) == t
Statement.15. For any integer m and n
S[n]( S[m]( t ) ) == S[n+m]( t )
If the system contains additional stacks, one can prove similar
statemnts for them, too.
Statement.16. (b) S[-1]B( S[-1]( t:RE ) ) == S[-1]( S[-1]B( t ) )
Proof.
(b) DROPB S[-1]( t ) 'r[ N ; ] ==
== DROPB M t DROP 'r[ N ; ] ==
== M DROPB t 'r[ N ; ] DROP ==
== S[-1]( S[-1]B( t ) )
Statement.17. (b) S1( S1B( t ) ) == S1B( S1( t ) )
Proof.
(in view of the previous theorem)
(b) S[-1]( S[-1]B( S1B( S1( t ) ) ) ) ==
== t == S[-1]B( S[-1]( S1( S1B( t ) ) ) ) ==
== S[-1]( S[-1]B( S1( S1B( t ) ) ) )
Now in view of the theorems of comparison by S[-1] and S[-1]B
(b) S1( S1B( t ) ) == S1B( S1( t ) )
Statement.18. t:R:!SB
i.e. (b) S1B( t:R ) == t
Proof.
(b) 'r[ N ; ] t == t 'r[ N ; ]
4.4 PROPERTIES OF BS[m]
Statement.1. (b) BS1( BS1( t ) ) == BS2( t )
Proof.
%A N : %A M :
(b) N M BS1( BS1( t ) ) x:US2:R ; ==
== N 'r[ M ; ] BS1( t ) 'r[ DROP ; ] M x ; ==
== 'r[ M ; ] 'r[ N ; ] t 'r[ 2DROP ; ] N M x ;
Statement.2. (b) t == BS1( t ) => %A m t:BS[m]
Proof.
%A m
(b) t == BS1( t ) == BS1( BS1( t ) ) == ... == BS[m]( t )
Statement.3. %A N : (b) t:RE == N DROPB BS1( t ) BDROP
Proof.
%A N
(b) N DROPB BS1( t ) BDROP ==
== 'r[ r>> ] 'r[ DROP ; ] N BS1( t ) >R 'r[ R> ; ] >>:REI ==
== [r 'r[ DROP ; ] N BS1( t ) >R 'r[ R> ; ] >> '] r>> ==
== [r [r N BS1( t ) >R 'r[ R> ; ] >> '] DROP ; '] r>> ==
== [r [r N BS1( t ) >R [r >> '] R> ; '] DROP ; '] r>> ==
== [r [r t N S1( [r >> '] ) DROP ; '] N DROP ; '] r>> ==
== [r [r t [r >> '] ; '] ; '] r>> ==
== 'r[ r>> ] [r t [r >> '] ; '] ; ==
== 'r[ ; ] t:RE 'r[ ; ] >>:REI == t
Statement.4. %A t:RE : t:BS1 <=> %A N : (b) BS1( N DROPB t BDROP ) == t
Proof.
( <= ).
%E u == [ 0 DROPB t BDROP ] : (b) BS1( u ) == t
=> t:BS1
( => ).
(b) BS1( N DROPB t:BS1 BDROP ==
== BS1( N DROPB BS1( BS[-1]( t ) ) BDROP ) ==
( BS[-1]( t ) exists, since t:BS1 )
== BS1( DROPB 'r[ N ; ] BS[-1]( t ) N DROPB BDROP ) ==
== BS1( BS[-1]( t ) ) == t
Consequence.
From the statement of the theorem and the definition of
BS[-1]( t ) it follows that
%A N : (b) BS[-1]( t:RE ) == N DROPB t BDROP ,
and vice versa---the statement of the theorem follows from this
statement.
Statement.5. BS[-1]( BS1( t:RE ) ) == t
Proof follows from the two previous theorems.
Statement.6. ( Comparison by BS[-1] )
Let t:RE:BS1, u:RE:BS1 .
Then (b) S[-1]( t ) == S[-1]( u ) <=> (b) t == u
Proof.
( <= )
It is evident. We understand b-equivalence of TC fragments so that
it cannot be not true.
( => )
(b) BS[-1]( t ) == BS[-1]( u ) =>
(b) BS1( BS[-1]( t ) ) == BS1( BS[-1]( u ) ) =>
(b) t == u
Statement.7. ( Comparison by BS1 )
(b) BS1( t ) == BS1( u ) <=> (b) t == u
Proof.
( <= )
By understanding of TCF equivalence.
( => )
(b) t == BS[-1]( BS1( t ) ) == BS[-1]( BS1( u ) ) == u
Statement.8 If BS1( t ), BS1( u ) exist, then
BS1( t u ) == BS1( t ) BS1( u )
Proof.
(b) N BS1( t ) BS1( u ) BDROP ==
== 'r[ N ; ] t N DROPB BS1( u ) BDROP ==
== 'r[ N ; ] t DROPB 'r[ N ; ] u N DROPB ==
== 'r [ N ; ] t u N DROPB
== N BS1( t u )
Consequence. [t:BS1 u:BS1]:!BS1
Statement.9. [r BS1( t:RE ) S1( u:R ) ; '] == S1( [r t u ; '] )
Proof.
N [r BS1( t ) S1( u ) ; '] ==
== [r t N S1( u ) DROP ; '] N ==
== [r t u '] N == N S1( [r t u ; '] )
Statement.10. [r BS1( t:RE ) u:US1:R ; ']:!US1
Proof.
%A N :
N [r BS1( t ) u ; '] == [r t N u DROP ; '] N ==
== [r t N u DROP ; '] N == [r t N u DROP ; '] N DROP N ==
== N [r BS1( t ) u ; '] DROP N
Statement.11.
u:BS1:RE, v:BS1:RE =>
(b) BS[-1]( u ) BS[-1]( v ) == BS[-1]( u v )
Proof.
%A N %A >>:S1EI :
[r BS[-1]( u v ) >> '] r>> ==
== [r N u v [ BDROP >> ]:S1EI '] r>> ==
== [r N u [ v BDROP >> ]:S1EI '] r>> ==
== [r BS[-1]( u ) N DROPB [ v BDROP >> ]:S1EI '] N r>> ==
== [r BS[-1]( u ) [r N v BDROP >> '] DROP ; '] N r>> ==
== [r BS[-1]( u ) [r BS[-1]( v ) N DROPB BDROP >> ']
N DROP ; '] N r>> ==
== [r BS[-1]( u ) BS[-1]( v ) >> ] N r>>
Statement.12. If exists S1B( S1( t ) ) , then S1B( S1( t ) ) == BS1( t )
Note. As has been shown above, S1B( S1( t ) ) == S1( S1B( t ) ) .
Proof.
%A N : %A >>:US1EI
(b) N S1( S1B( t ) ) ==
== S1B( t ) N >>:US1EI ==
== S1B( t ) N [r >> ']:US1 DROP N ; ==
== S1B( t ) 'r[ N ; ] DROPB N >> ==
== 'r[ N ; ] t DROPB N >> ==
== BS1( t )
Statement.13. BS1( t:R ) == S1( t )
Proof.
S1( t:R ) == S1( S1B( t ) ) == BS1( t )
Statement.14. BS1( t:REI ) == BDROP t
Proof.
The definition of BS1 is:
(b) N BS1( t ) >>:US1EI == 'r[ N ; ] t N DROPB
Let us show that BDROP t satisfies it.
At first, since t:REI, %A z : [ t z ] == t . Therefore
'r[ N ; ] t N DROPB == 'r[ N ; ] t
and since BDROP == >R 'r[ R> ; ] , and [r t:REI ']:R
(b) 'r[ N ; ] t == N BDROP t
4.5 PRACTICALLY IMPORTANT STATEMENTS ABOUT 2-ND ORDER TCFs
One more stack (the continuation stack, or L-stack) is used to
implement backtracking. For brevity, we omit definitions of
L[m]( t ) , BL[m]( t ) , t:L , etc., since they are the same as
corresponding definitions for data stack ( S[m]( t ) ,
BS[m]( t ) , t:S and so on).
In [Gas95] we have showed that PRO == R> >L 'r[ LDROP ; ] where PRO
is defined as
: PRO R> R> >L ENTER LDROP ;
The word PRO is used in BacFORTH [Gas94] to complements a 1-st order
call (i.e. ordinary one) to 2-nd order call (that is used for
backtrackable code). The word CONT performs a success (a call of
continuation) using the continuation address placed onto the L-stack
by PRO.
Statement.1. CONT == 'r[ >> ] L@ BLDROP >R ;
Proof.
By definition:
CONT == [r L> >R R@ ENTER R> >L ; '] ==
== 'r[ >> ] L> >R R@ 'r[ R> >L ; ] >R ;
BLDROP == L> >R 'r[ R> >L ; ]
'r[ >> ] L@ BLDROP >R ; ==
== 'r[ >> ] L@ L> >R 'r[ R> >L ; ] >R ; ==
== 'r[ >> ] L> >R R@ 'r[ R> >L ; ] >R ; ==
== CONT >>
Statement.2. (b) [r PRO BL1( t ) CONT ; '] == t
Proof.
(b) [r PRO BL1( t ) CONT ; '] ==
== 'r[ >> ] R> >L LDROPB BL1( t ) CONT ; ==
== 'r[ >> ] R> >L LDROPB BL1( t ) CONT ; ==
== 'l[ >> ] LDROPB BL1( t ) 'r[ ; ] L@ BLDROP >R ; ==
== BL[-1]( BL1( t ) ) >> == t
(2nd order call of BL1( t ) (a TCF which is "almost equivalent" to
t but which does not affect (nor depends on) the L-stack top) is
b-equivalent to t substituted inline).
Consequence. (b) [r PRO t:L CONT ; '] == t
(If a TCF does not depend on the L-stack, its 2nd order call is
b-equivalent to its inline substitution)
At this point we finish our equivalence-based formal description of
control flow effects of return stack manipulations. We could not only
describe traditional the control structures (call, EXIT, BRANCH,
?BRANCH) [Gas95], but to build an adequate description of such untrivial
execution method as backtracking (constructing 2nd order threaded code,
we implement backtracking). The cut statements, as well as the RP@ and
RP! operators which access the return stack pointer directly (and
therefore treat the return stack not as a stack) remain unexplored.
Description of the 3rd order threaded code, which is found useful when
the potentialities of classical backtracking are exhausted, remains
beyond the scope of this work as well.
Nevertheless, what has been done is sufficient to state that our system
of notions copes well with the task.
5 THE REPEATABILITY PRINCIPLE
5.1 FORMULATION
We want to present one more axiom which has not been found necessary so
far.
Axiom. (Repeatability principle in case of no interaction with external
world)
%A << %A >> %A t %E x,z :
t == x t
z DROP == NOOP
t == x t DROP z
This means that given the memory contents before execution of t, any
value that t leaves on the stack may be calculated one more time (since
we can keep the necessary information in a location unused by other
code; the principle states that for any program such place can be
found). This axiom enables us to name values in the stack (this could
not be done by locals because each particular implementation of locals
has its restrictions: scope, etc.).
Note. The context ( << and >> ) may pass control to t many times;
to avoid confusion, a notation that resembles locals can be introduced.
Axiom. (Repeatability principle in general case)
%A << %A >> %A t %E t1..tn %E x0..xn ,z0..zn :
t == t1 .. tn
t == x0 t1 x1 t2 x2 .. tn xn
%A i=0..n : zi DROP == NOOP
t == x0 z0 DROP t1 x1 t2 x2 .. tn xn
t == x0 t1 z1 DROP x1 t2 x2 .. tn xn
....
t == x0 t1 x1 t2 x2 .. tn xn zn DROP
This means that given the memory statebefore execution and results of
all interactions with the external world, one can predict any value that
can be placed on the stack anywhere in the program.
5.2 DISCUSSION OF THE REPEATABILITY PRINCIPLE
Thus, the repeatability principle enables us to introduce notions
equivalent to the notion of a value of the stack. In other words, it
enables a theorist who knows axioms but nothing about the stack to infer
that values on the stack exist. Although it postulates an important
property of the threaded code, we could do without it so far.
We did not use the repeatability principle intentionally: it
significantly differs from other axioms. All other axioms describe
properties of the equivalence and do not introduce new objects.
Subdivision of t into t1...tn and finding x0...xn and z0...zn is not a
trivial operation: it requires understanding of the program as a whole.
It states a property of program, not a property of a local piece of
code. It is not an elementary property.
Finally, the repeatability principle looks like a property very specific
to executable code. Although we do not know of another object with
similar properties, we would like to avoid making our system so
specific.
5 DISCUSSION
The effects of return stack manipulations may be statically predicted.
One of the most important properties of code fragments with respect to
the return addresses is to leave them unchanged. In particular, this
enables us to convert TC fragments to linear form which is often the
most convenient form for the analysis: indeed, it is easier to follow
the course of linear execution. However, practice has showed that this
form is not well-suited for most problems: it is easier to cope with
programs subdivided into units which reflect our concepts on the
problem, and creation of an interpreter of these natural units pays for
itself. This idea may be found behind such approaches as
macroprocessing, problem-oriented languages, use of procedures,
backtracking, data-driven approach. So, it is important to find the most
natural notions, most natural syntax, and be able to construct an
interpreter for this syntax (for a complex problem or for a class of
similar problems).
It is evident that a backtrackable procedure uses its continuation as a
parameter (it calls it when it succeeds). It is less evident that any
Forth procedure uses its continuation as a parameter. Indeed, EXIT
ignores its continuation, BRANCH fetches a destination address from it,
while most procedures pass control to it.
We have introduced the notion of 2nd order threaded code. The 2-order
threaded code (i.e. backtrackable code) enables one to implement a
syntax that resembles a dataflow diagram:
To implement backtracking we have used the "success as continuation
call" method. In the same way, via continuation call, one can build 3rd
and higher order threaded codes. A 3-order procedure success will
consist in calling continuation as a fragment of 2-order code that ends
with CONT EXIT (code that performs success). A 3 order procedure is
described by 8 stack states, which is 1 more that the mean psychological
constant, but it is not a problem for a person who can use backtracking
(the trick is to have most of the states empty).
The 3 order threaded code enables one to program in terms of
backtrackable (bi-directional) dataflows, i.e. enables one to return to
this convenient code arrangement when the capabilities of backtracking
have already been exhausted. The only real problem where it has happened
is an experimental program to reconstruct the syntax tree for a sentence
in an inflective natural language (namely, Russian).
6 RESULTS
We have shown that in most cases the effects of return stack
manipulations may be statically predicted.
The concept of position (active, passive, etc.) is introduced that
represents the differences between code, data, self-modifying code, etc.
A notation has been developed that adequately describes processing of
[threaded] code by the code interpreter and effect of return address
manipulations on it. Formal description of how return addresses
manipulations are reflected in the control flow changes is given. The
calculus enables one to analyse return addresses manipulations, prove
their correctness, perform equivalent transformations of code fragments.
The calculus have been successfully used to describe changing to a
different execution mechanism--backtracking.
Notions of 2nd and 3rd order codes have been introduced. The 3-order
threaded code can help in cases where the capabilities of backtracking
are exhausted.
The theorem of 2-order call equivalence has been proved. The theorems of
1st and 2nd order call equivalence provide a clue for construction of
optimizing Forth compilers.
REFERENCES
[Gas93R] Gassanenko M.L.
Novye sintaksicheskie konstruktsii i be`ktreking dlja jazyka
Fort.//Problemy tehnologii programmirovanija - SPb: SPIIRAN,
1991-93, p.148-162. (New Control Structures and Backtracking
for the Forth Language, in Russian.)
[Gas94] Gassanenko, M.L. BacFORTH: An Approach to New Control
Structures. Proc. of the EuroForth'94 conference, 4-6 November
1994, Royal Hotel, Winchester, UK, p.39-41.
[Gas95] Gassanenko, M.L. Formalization of Return Addresses
Manipulations and Control Transfers. Proc. of the EuroForth'95
conference, 27-29 October 1995, International Centre for
Informatics, Dagstuhl Castle, Germany.
[Poi91] Poial J. Multiple Stack-effects of Forth Programs. 1991 FORML
Conference Proceedings, euroFORML'91 Conference, Oct 11-13,
1991, Marianske Lazne, Czechoslovakia, Forth Interest
Group, Inc., Oakland, USA, 1992, 400-406.
[Poi93] Poial J. Some Ideas on Formal Specification of Forth Programs.
9th euroFORTH conference on the FORTH programming language and
FORTH processors, Oct 15-18, 1993, Marianske Lazne, Czech
Republic, 1993, 4 pp.
[Poi94] Poial, Jaanus. Forth and Formal Language Theory. Proc. of the
EuroForth'94 conference, 4-6 November 1994, Royal Hotel,
Winchester, UK, 6 pp.
[StK92] Stoddart B., Knaggs P. The (Almost) Complete Theory of Forth
Type Inference. Proc. EuroForth Conference 1992.
[StK93] Stoddart B., Knaggs P. Type Inference in Stack Based Languages.
Formal Aspects of Computing, BCS, 1993, 5, 289-298.
[Sto93] Stoddart B. Self Reference and Type Inference in Forth. 9th
euroFORTH conference on the FORTH programming language and
FORTH processors, Oct 15-18, 1993, Marianske Lazne, Czech
Republic, 1993, 9 pp.