Scheme Operational Semantics

Lecture Notes on
Scheme Operational Semantics

To provide an operational semantics of the Scheme language means to formally describe the process of evaluation for its expressions. The most formal and precise way to do that is by means of a formal system, as done in the paper "Structured Operational Semantics of a fragment of the language Scheme" (part of the reading material of the course.)
In the present notes we provide a precise but slightly informal description that can be used as an introductory reading before studying the above mentioned paper.

Being Scheme a functional programming language, one could think that its semantics could be described simply in terms of notions like beta-reduction, strategy of reduction, value ecc.
Actually it is not so. This is partly due to the presence in Scheme of the define expressions (not present in the lambda-calculus and harmfully related to the notion of "modification"). Moreover, we need to take into account the fact that we need efficient implementations of the beta-reduction, and this cannot but affect the formal operational semantics of the language.
So, we need to introduce a few concepts less abstract than the ones present in the heavenly world of lambda-calculus (as said before, concepts connected also to implementation-related matters.)

A frame is a finite set of associations between variable names and values.
An environment is a set of frames, connected in a tree structure by means of pointers.
Every frame has a pointer to its father frame. The frame at the root of the tree structure is called top-level frame.
These notions are related to implementation matters and mainly needed to efficiently implement the operation of substitution (and hence the beta-reduction).

When a Scheme session is opened, the environment consists only in the top level frame, containing the default associations for the predefined symbols. This environment will be possibly modified by the evaluation of the Scheme expressions during our session.
It is strange to have to consider the notion of modification, a typically imperative concept, in a functional language, but it is necessary in order to avoid making the implementation of the language too complex. Of course this can create strange results in the evaluation of expressions if we are not aware of it.

When the interpreter evaluates a define expression, the evaluation is performed in the current frame (the top level one in case we have just opened the Scheme session). The evaluation of a define expression modifies the current frame.
Let us assume, for example, that the frame F₂ is the current frame and that we wish to evaluate an expression (define var exp)
What happens is that exp is evaluated, let v be its value, and that the current frame is modified by inserting the new association var -> v.

Example:

 (define c (+ 5 3)) 
 (define b (= c 8)) 
 (define c 2)  
 (= c 8) 
 > #f

The evaluation of the expression (+ 5 3) produces the value 8, which is associated to the variable c.
Let assume that at the beginning of the example the current frame was F₂ and that it was empty. The evaluation of the first define modifies F₂ by adding the association c -> 8:

F₂-------- 
| c -> 8 | 
----------

The expression (= c 8) is evaluated, giving true as value (we need to distinguish between the expression #t and its value), and the frame is modified by adding b -> true:

 F₂------------ 
 | c -> 8     | 
 | b -> true  | 
 --------------

(define c 2) is evaluated and since c is a variable already present in this frame, it is overwritten (we need always to be aware that "overwriting" can have unpredictable side-effects).

 F₂------------ 
 | c -> 2     | 
 | b -> true  | 
 --------------

If we now evaluate again (= c 8) its value is obviously "false". So, the value of this expression depends on when it is evaluated, not a very nice thing indeed!

Notice that, if we write b on the prompt, we obtain "true" as value. In fact the evaluation of the value to associate to b is done only once, when the interpreter evaluate the expression (define b (= c 8)); from that moment on the value of b will be true (unless we "overwrite" the variable b).
If we now try to evaluate the expression (= t 8), the interpreter looks for an association for t in F₂. If it does not find it, it looks on the father frame and so on, possibly up to the top level frame. If it does not find any association for t, an error is returned.

The value of functions, implementation of beta-reduction, scoping problem, closures and evaluation of applications

We said that, in an environment, values are associated to identifiers. We know that for Scheme an abstraction is a value; So the value of a lambda-expression is the lambda-expression itself.
Then, if we evaluate

(define (f x) (+ 3 x))

(recall that this definition is equivalent to (define f (lambda (x) (+ 3 x))) ) )
the following association is inserted in the current frame F:
f --> value-of-(lambda (x) (+ 3 x)).
So, according to the fact that a lambda expression is a value (and hence cannot be further evaluated) such association is
f --> (lambda (x) (+ 3 x))

If we now evaluate

 (f 5)

the interpreter looks in the environment for an association for the symbol f and finds the lambda-expression (lambda (x) (+ 3 x)). Now, since we get immediately the value of the argument of f, 5, we expect the interpreter to perform the beta reduction of the redex
((lambda (x) (+ 3 x)) 5)
and, afterwards, the evaluation of
(+ 3 5)
( + is a predefined function and the interpreter will call an inner procedure in order to compute 3 + 5).

Since substitution is an heavy operation to implement, here is what Scheme does instead of actually performing a beta-reduction:
it simply extends the current frame with a new temporary frame containing the associations formal parameters/actual parameters of the function, and then evaluates the body of the function using this new frame as current frame.
In our case it would evaluate the expression (+ 3 x) in the frame F₁

 |                         |
 F-------------------------|
               ^
               |
        F₁------------
         | x --> 5   |
         -------------

You see, in this way we sort of perform the beta-reduction avoiding to actually perform the substitution.

Let us go back now to the value of lambda expressions. As a matter of fact the value of a lambda expression is something a bit more complex than simply the lambda expression itself. To explain why, let us consider the following example.

(define a 3) 
(define (f x) (+ a x)) 
(define (g y) 
    (let ((a 2)) 
      (f y))) 
(g 5)

According to what we said up to now, f should now be associated to the "value " (lambda (x) (+ a x)).
You can notice that an association for the variable "a" has been created before the definition of f. Besides, in the body of the function g, "a" is a local variable with value 2.
Now, in the definition of a language one needs to make a choice: when evaluating (g 5) (and hence its body, containing the call (f y)), what do we wish the value of the variable 'a' to be?
In other words, what is the value of the free variables in the body of a function?

First choice: the value they had when we defined the function.

Second choice: the value they have at the moment the function is called with an actual argument.

Making the first choice means that we wish our language to have static scoping.
Making the second one means that we wish our language to have dynamic scoping.
Those who defined the language Scheme made the first choice (the same choice of most the languages; the first version of LISP, instead, had dynamic scoping).
This is a good choice, otherwise it would be difficult to understand what is going on in a program. Besides, the value of an expression like (f 5) would not be the same everywhere in a program.

A language with static scoping implies however some machinery for its realization.
In particular it is necessary to devise a way to remember the values associated to the free variables in a function when it is defined. So, we need to remember not only the lambda expression corresponding to a function, but also the frame in which the function has been defined.
It is possible to do that by simply considering, as value of a lambda expression, not only the lambda expression itself, but also the frame in which the lambda expression is evaluated. For efficiency reasons, instead of the frame we consider a pointer to it.
A pair <lambda-expression, pointer-to-frame> is called a closure.
So, in Scheme the value of a lambda-expression, is not, as we have wrongly said up to now, the lambda expression itself, but it is a closure: <lambda-expression, p>, where p is the pointer to the frame where the lambda-expression is evaluated. Closures belong to the set of Scheme values.
Pointers present in closures are used when the interpreter evaluates an application.

The implementation of static scoping by means of closures works, roughly, as follows: in order to evaluate an application (e₀ e₁...e_m) where the value of e₀ is a closure, the interpreter evaluates the body of the function using the pointer in the closure, and taking into account the associations formal parameters -> actual parameters.

A bit more formally:

The value of lambda expressions:

The value of a lambda-expression (lambda ...) is a closure. If (lambda ...) is evaluated when p is the current frame pointer, the value of (lambda ...) is the closure <(lambda ...), p >.

The value of applications:

When the interpreter finds an expression like (e₀e₁ e₂...e_m) it first checks if e₀ is a keyword. In case not, it evaluates e₀ in the current frame. Its value must be either a predefined function or a closure. Then the expressions e₁ ... e_m are evaluated.
In case the value of e₀ is a closure <(lambda (x₁...x_m) body),p>, the value of the application is obtained in the following way. A new frame, containing the associations x_i -> v_i is added under the frame pointed by p, where v_i is the value of e_i. Let p' be the pointer to this new frame. The value of the application is the value obtained by evaluating body using p' as current frame pointer. After the evaluation of body, the current frame pointer turns back to be p again. Notice that the interpreter also checks if the number of the arguments matches the arity of the function.

Let us see, step by step, what really happens during the evaluation of the previous example.
We are in the frame F, pointed, say, by p₁. So p₁ is the current frame pointer. When the expression

 (define a 3)

is evaluated we get

  |  a --> 3   |<---p₁
  F-------------

Then when we evaluate

 (define (f x) (+ a x))

the frame F is modified as follows

 |  a --> 3                             |
 |  f --> <(lambda (x) (+ a x)), p₁>    |<--- p₁
 F---------------------------------------

where the current frame pointer is still p₁.
Then the evaluation of

 (define (g y) 
     (let ((a 2)) 
       (f y)))

modifies F as follows

| a --> 3                                       |
| f --> <(lambda (x) (+ a x)), p₁>              | 
| g --> <(lambda (y) (let ((a 2)) (f y))), p₁>  | <---
F------------------------------------------------    p₁

Now, if we evaluate

 (g 5)

in F (that is using p1 as current frame pointer), the interpreter first evaluates the identifier "g". Its value is found in F: the closure <(lambda (y) (let ((a 2)) (f y))), p₁>.
Then the interpreter evaluates the expression "5", whose value is, surprise surprise, 5.
Now the interpreter should perform the beta-reduction and go on evaluating the contractum in the frame in which g has been defined, which, in this case is F itself, pointed by p₁.
The evaluation of the contractum is performed as follows.
The interpreter extends the frame pointed by p₁ with a new frame containing the associations formal-parameters/actual-paramenters. So we get

|  a -- 3                                        |
|  f --> <(lambda (x) (+ a x)), p₁>              | 
|  g --> <(lambda (y) (let ((a 2)) (f y))), p₁>  | <--- p₁
F-------------------------------------------------    
                          ^
                          | p₁
                    --------------
                    |  y --> 5   | <---- p₂
                    F₁------------

Now, using p₂ as current frame pointer, the interpreter evaluates the body of g, that is it evaluates

 (let ((a 2)) 
   (f y))

We know (rsSee Notes on let) that evaluating this expression is like evaluating

 ((lambda (a) (f y))  2)

So, since the evaluation of (lambda (a) (f y)) in p₂ is the closure <(lambda (a) (f y)), p₂>, we get

|  a --> 3                                      |
|  f --> <(lambda (x) (+ a x)), p₁>             | 
|  g --> <(lambda (y) (let ((a 2)) (f y))), p₁> | <--- p₁ 
F------------------------------------------------
                        ^
                        | p₁
                  --------------
                  |  y --> 5   | <--- p₂
                  F₁------------
                        ^
                        |p₂
                 --------------
                 |  a --> 2   | <--- p₃
                 F₂------------

We have now to evaluate (f y) using now p₃ as current frame pointer.
The interpreter first looks for the value of the identifier "f" in F₂. It does not find it, so goes up and looks for it in F₁. Again, it does not find it, so goes up and looks for it in F. Here it gets the value for "f", which is <(lambda (x) (+ a x)), p₁>.
Then the interpreter evaluates "y" in F₂. Since there exists no association for "y" in F₂ the interpreter goes up and find an association for it in F₁.
So, the value of "y" is 5.
Now the interpreter look at the pointer of the closure <(lambda (x) (+ a x)), p₁> and extends the frame pointed by p1 with a new one, containing the association x --> 5.
So we get

|  a --> 3                                      |
|  f --> <(lambda (x) (+ a x)), p₁>             | 
|  g --> <(lambda (y) (let ((a 2)) (f y))), p₁> | <--- p₁ 
F------------------------------------------------
              ^                       ^
              | p₁                    | p₁
        --------------          --------------
        |  y --> 5   | <--- p₂  |  x --> 5   | <--- p₄
        F₁------------          F₃------------
              ^
              |p₂
       --------------
       |  a --> 2   | <--- p₃
       F₂------------

In F₃ (that is using p₄ as current frame pointer) the interpreter now evaluates the body

 (+ a x)

+ is a predefined operator, so the interpreters starts evaluating its arguments. It looks for an association for "a" in F₃. Not finding any, it goes up and finds an association for "a" in F. So the evaluation of "a" in F₃ returns the value 3.
The evaluation of "x" in F₃ returns 5.
Hence the value of (+ a x) in F₃ is 8, which is then the value of (g 5) in F.

You see how, by means of pointers in closures, we managed to evaluate the body of f considering as value of "a" the one associated to it when f has been defined (even if, at the moment of the evaluation of (f y) in F₂, "a" is associated to the value 2.)
When the evaluation of (g 5) terminates, that is after the evaluation of (+ a x) in F₃, the current frame pointer turns back to be p1 again.

At the end of the evaluation of (g 5) the temporary frames used during the evaluation of (g 5) are logically "destroyed". More simply, any reference to the temporary frames is lost (the pointers p₂, p₃ and p₄ are not usable anymore). The Garbage Collection process will realize later on that some frames are not accessible anymore (i.e. no usable pointer points at them) and hence it will "discard" them (in implementation terms it will deallocate the memory they occupy).

NOTE: During the evaluation of an expression it is often the case that a current frame pointer p changes, in order to complete the evaluation of some subexpressions, and then turns back to be p again. It is clear that, in order to keep track of what is the current frame pointer and which pointer will turn back to be the current one after an evaluation, a data structure is needed: a "Stack of pointers". The upper one being the current frame pointer, while those below it being the ones that will turn back to be the current one after the various evaluations.

Conditionals

Let us see how conditional expressions are evaluated in Scheme.

 (if exp exp₁ exp₂)

The interpreter evaluates exp; if its value is different from false it evaluates exp₁ and returns its value as the value of the whole conditional expression. Otherwise it evaluates exp₂ and returns its value.
Notice that, with this semantics, anything different from false is supposed to be true. This is not a very clever thing to do. It could lead to subtle and harmful errors.
For example, the evaluation of (if 3 2 5) returns 2.

 (cond (test₁ exp₁)...(test_n exp_n) (else exp))

A cond expression can be looked at as a series of nested if. Hence its operational semantics can be given in terms of if.

Let and Letrec

For what concerns the let expression

 (let ((var₁ exp₁) 
           ... 
       (var_n exp_n)) 
    exp)

As it is explained in the Notes on let, the operational semantics of let expressions can be given in terms of application and lambda abstraction. So, there is no need to describe explicitely how a let expression is evaluated.

There is a version of lec, however, that cannot be described in terms of other expressions, it is the letrec.
In order to introduce letrec, let us assume that we wish to use, in an expression exp, a function f that never terminates (something silly to do, but it is an example). Let us assume also that we wish to use f only inside exp, that is we wish to use f locally.
Then we can write the following expression

 (let ((f  (lambda (x) (f x)))
    (f 3))

So, everything looks ok, since f is a function that keeps calling itself recursively forever. At least so it seems to us.
Let us check what really happens when we evaluate such an expression: by the semantics of the let expression, the evaluation of

 (let ((f  (lambda (x) (f x)))
    (f 3))

corresponds to the evaluation of

 ( (lambda (f) (f 3)) (lambda (x) (f x)) )

Let us assume that p is the current frame pointer. What happens when this expression is evaluated? We first evaluate (lambda (f) (f 3)) Its value is the closure <(lambda (f) (f 3)), p>.
Then we evaluate the argument (lambda (x) (f x)) Its value is <(lambda (x) (f x)), p>
Now we can proceed to the evaluation of the application by extending the frame pointed by the pointer in the closure (lambda (x) (f x)) with a new temporary frame, as follows

 |                      | <--- p
 F----------------------
            ^
            |
            |
   ---------------------------------
   | f --> <(lambda (x) (f x)), p> | <--- p₁
   F₁-------------------------------

The new temporary frame, which is now the current frame, contains the association between the formal and the actual argument of the function. In our case such association is between the local variable f of the let expression and its value. What it is relevant to notice is that the value associated to f has been evaluated using p as current frame pointer, that is the current frame pointer used for the evaluation of the let expression itself.
Now the interpreter goes on and evaluates the body of the let, (f 3), using p₁ as current frame pointer.
Since the value of f in p₁ is the closure <(lambda (x) (f x)), p>, the interpreter extends the frame pointed by p with the frame containing the association of x (formal paramenter of the function) to 3 (actual paramenter).

 |                      |<--- p
 F----------------------
     ^    ^                          ------------
     |    |__________________________| x --> 3  |<--- p₂
     |                               F₂----------
   ---------------------------------
   | f --> <(lambda (x) (f x)), p> | <--- p₁
   F₁-------------------------------

The computation goes on with the evaluation of (f x) with p₂ as current frame pointer. It is easy to check now that when we evaluate (f x) in F₂ we get an error, since there is no value associated to f in F₂ and no value in F either!
So, this is not what we expected, but it is how a let expression is evaluated!
Things would have been different if the value to be associated to our local variable had been <(lambda (x) (f x)), p₁> instead of <(lambda (x) (f x)), p>.
In such a case the evolution of the frames during the evaluation would have been

 |                      |<--- p
 F----------------------
     ^    
     |    
     |   
   ---------------------------------
   | f --> <(lambda (x) (f x)), p₁>| <--- p₁
   F₁-------------------------------

and then

 |                      |<--- p
 F----------------------
     ^    
     |    
     |   
   ---------------------------------
   | f --> <(lambda (x) (f x)), p₁>| <--- p₁
   F₁-------------------------------
                ^
                |
           -----------
           | x --> 3 |<-- p₂
	  F₂----------

Notice how now, the evaluation of (f x) in F₂ would really lead to an infinite computation.

How can we achieve such an effect?
By using letrec instead of let!

 (letrec ((f  (lambda (x) (f x)))
    (f 3))

In the evaluation of a letrec expression, the expressions whose values will be associated to the local variables are evaluated in the very same frame that will contain the new local associations. That is why, by using letrec instead of let, we get p₁ instead of p in the closure associated to f.

Notice that, unlike the let, the behaviour of a letrec expression cannot be described in terms of other expressions. Hence its semantics has to be given explicitely.

In order to fix better what we have learned, let us have a look at another example. It is an example showing that, when a pointer p is in the closure associated to an identifier in a frame F, it is not always the case that p points to F itself.

Let us consider the following Scheme session.

(define f (lambda (y) y)) 
(define (g x) 
    (define h f) 
    (h x)) 
(g 1)

We assume that, before beginning the session, p₁ is the current frame pointer, pointing to the frame F₁.
After the evaluation of the first define we get

  p₁--> | f -> <(lambda (y) y), p₁> | 
        F₁---------------------------

The evaluation of the second define modifies F₁ as follows.

      |                                           |               
p₁--> | f -> <(lambda (y) y), p₁>                 | 
      | g -> <(lambda (x) (define h f) (h x)), p₁>| 
      F₁-------------------------------------------

At this point we evaluate the application (g 1).

Remark: Recall that when the interpreter evaluates an application (exp₁ exp₂), it does not expect exp₁ to actually be a lambda abstraction or a predefined function, but rather its value to be one of the two.

The value of g is a closure, the value of 1 is 1. The semantics of the application tells us that if the value of the first expression is a closure, the frame pointed by the pointer in the closure has to be extended with a new frame containing the associations "formal parameters -> actual parameters". Such a new frame becomes the current frame.

      |                                           |
p₁--> | f -> <(lambda (y) y), p₁>                 | 
      | g -> <(lambda (x) (define h f) (h x)), p₁>| 
      F₁------------------------------------------- 
                ^
                |
          F₂ -------- 
    p₂ --> | x -> 1 | 
           ----------

Now we evaluate g's body, that is

  (define h f) 
  (h x)

using p₂ as current frame pointer.
The evaluation of the local define modifies the current frame F₂ inserting the association "h -> f's value". Since f is a variable, the interpreter looks for an association for f in F₂. Not finding any, it looks for it in F₂.
So, the value of f is >(lambda (y) y), p₁> the evaluation of the local define modifies the environment as follows.

      |                                           | 
p₁--> | f -> <(lambda (y) y), p₁>                 | 
      | g -> <(lambda (x) (define h f) (h x)), p₁>| 
      F₁------------------------------------------- 
                ^
                |
          F₂ --------------------------- 
    p₂ --> | x -> 1                    | 
           | h -> <(lambda (y) y), p₁> |
           -----------------------------

You see, now we have F₂ as current frame, containing an association var -> closure, and the pointer in the closure does not point to F₂ but to another frame, F₁.
We now evaluate (h x) (we still have p₂ as current frame pointer).
The values of h and x are found in F₂ (the value of h is a closure and the value of x is 1). The closure associated to h contains the pointer p₁, pointing to F₁; so it is the frame F₁ that needs to be extended with a new frame containing the associations "formal parameters -> actual parameters".

      |                                           |
p₁--> | f -> <(lambda (y) y), p₁>                 | 
      | g -> <(lambda (x) (define h f) (h x)), p₁>| 
      F₁------------------------------------------- 
             ^                             ^
             |                             |
      F₂---------------------------    F₃-----------
p₂--> | x -> 1                    |    |  y --> 1  |<-- p₃
      | h -> <(lambda (y) y), p₁> |    -------------
      -----------------------------

Now, with p₃ as current frame pointer, we evaluate h's body, that is y.
Therefore the value of (h x) is 1, and hence the value of (g 1) is 1.

In the previous example, in order to introduce the local variable h, a let expression could be used instead of the inner define (let expressions introduce local variables).

(define (g x) 
    (let ((h f)) 
      (h x)))

We notice however that between the two constructions there is a difference: an internal define modifies a frame, while the let expression adds a new frame.

Let us see what would have happened in case we had used let.
When g's body (let ((h f)) (h x)) is evaluated in F₂ frame, instead of modifying F₂ by inserting the association for h, the interpreter would have extended F₂ with a new frame containing the association
h -> <(lambda (y) y), p₁>.

   
         ...
          ^
	  |
      F₂-------- 
p₂--> | x -> 1 | 
      ---------- 
          ^ 
          | 
      F₄---------------------------
p₄--> | h -> <(lambda (y) y), p₁> | 
      ----------------------------

In our example above the final result is identical if we use let or the inner define. There can be cases, however, in which modifying a frame and "attaching a new frame to it" can produce different results.

Exercise: Describe what happens in the evaluation of the following SCHEME session (i.e. describing the various frames, their modifications etc. along the lines of the examples above)
It is not difficult, but it is rather complex. Try to solve it after you have solved other similar exercises in the web page of exercises, where there is also the solution of the present one.

 (define (K x) 
    (lambda (x) 
       (+ (let ((x 1)
               (fun (lambda (y) (+ x 1))))
             (fun x))
          (+ x x))))
  
 (define t 44)
  
 (define x 8)
  
 (let ((x (+ x 2))
       (gg (lambda (y) t))
       (t 66)
       (fun 21))
   ((lambda (x) (+ (gg 1) fun))  ((K fun) x) ))

Temporary memorization of values (A thank to P. Giarrusso)
During the evaluation of a Scheme expression, in order to produce the final result, there could be the need of temporarily holding some values.
For instance, if the interpreter evaluates the expression

(define a ((lambda (x) (+ 1 x)) (- 3 2)))

the value of the expression ((lambda (x) (+ 1 x)) (- 3 2)), i.e. 2, will be associated in the environment to the identifier a.
The evaluation of this application will consist in first evaluating the lambda expression (lambda (x) (+ 1 x), then the expression (- 3 2), and lasty the application itself.
If the evaluazion is performed with n as current frame pointer, the value of (lambda (x) (+ 1 x) is <n,(lambda (x) (+ 1 x)> and the value of (- 3 2) is 1.
It is now obvious that the interpreter need to hold somewhere the two values <n,(lambda (x) (+ 1 x)> and 1 during the evaluation of the application (i.e. during the period when the frame pointed by n is extended with the new one containing the association x->1 and the value of the body of the function is evaluated in this new frame.)
The way how an interpreter temporarily holds values may vary, highly depending on the particular implementation of the interpreter and goes beyond the scope of the present notes (it is also something our formal semantics does not care about).
For what concerns ourselves it is enough to imagine the interpreter to have some sort of notebook where it temporarily holds values when needed, as in the example presented above.

A formal system for the description of the operational semantics of Scheme

Giving an operational semantics for a language corresponds, roughly, to explaining how the interpreter of the language works. Up to now, for Scheme, we have done it in a rather informal way.
A real formal system (i.e. a set of rules and axioms) for the operational semantics of Scheme is given in the paper Structured Operational Semantics of the language Scheme.. So, this paper is what must be studied (its first 10 pages, excluding the set! and begin operators).
In what follows we present a smooth introduction to the contents of the paper.
Notice that in this formal description of Scheme operational semantics, the rules implicitly describe also the reduction strategy used by the interpreter during the evaluation of expressions.

A formal description of Scheme operational semantics is useful in order to make the semantics itself non ambiguous, and in order to ease the implementation of the interpreter.
Besides, a description in terms of a formal system can be also used in order to prove properties of programs and of implementations of interpreters.

As stated in the introduction of the paper mentioned above, we can have two sort of operational semantics: Small Step and Big Step.
In a Small Step operational semantics the judgments of the formal system can be interpreted as

"The expression X can be reduced with a single step of computation to the expression Y."
Our operational semantics of Scheme is, instead, a Big Step semantics, that is a formal system whose judgments can be roughly interpreted as

"The expression X can be reduced in many computational steps to the value Y".

Let us first precisely describe the form of the judgments of our formal system.
We have actually two formal systems:

one for the evaluation of the define expressions and
one for the evaluation of the other kind of expressions.

We need two formal systems since a judgment about a define expression has to say different things with respect to a judgment about a non define expression. In fact when an interpreter evaluates a define expression it modifies a frame and does not return any value; When, instead, it evaluates an expression it can possibly modify the environment, but surely it returns a value.
Then for the define expressions the judgments are of the form:

zeta, n |- d -> zeta'

whose informal meaning is:

In an environment zeta, using n as current frame pointer,
the evaluation of the define expression "d" modifies the environment from zeta to zeta'

For the other expressions the judgments are of the form:

zeta, n |- e -> v, zeta'

whose informal meaning is

In an environment zeta, using n as current frame pointer,
the evaluation of the expression "e" returns the value v
and modifies the environment from zeta to zeta'

You see, we are giving a Big Step semantics: the judgment does not say that the expression "e" reduces with one computational step to the expression "e' ", but, rather, it says what is the value that is obtained by the complete evaluation of "e", if it converges.

Remark: We have only studied a small subset of Scheme, avoiding the use of any imperative operator (useful sometimes, but to be used with great care); In the purely functional fragment of Scheme, the only modifications we can produce on the environment are those obtained by the evaluation of define expressions.
Hence for our fragment, in the judgments on non define expressions, there would be no need to mention the possibility of modifying the environment.
However, the paper mentioned at the beginning of this section (whose first ten pages we have to study) describes a formal system for the whole Scheme language, and hence any expression is assumed to be able to modify the environment.

To go on now with our formalization, we need to formalize precisely the Scheme syntax.
We introduce some syntactic categories:

the syntactic category of the variables (called Var).
the syntactic category of the constants (called K).
... and so on (study them in the paper)

To formalize, instead, what we intend for a value, we define the set of values V. To define V we introduce a few semantic categories:

the semantic category of the values of the constants (called K).
For instance, to say that the value of 3 is the number 3 we write 3 -> 3
the semantic category of the closures (called C).

In the set of the possible values we put also the indefinite element, denoted by the symbol "?"
In the pages 4-6 of the paper, you can find the description of the other syntactic and semantic categories.

We formalize now the concept of frame.
We know that a frame is a set of associations variable -> value. A frame can then be defined as an element of the domain

F = [Var => V] (the set of partial functions from variables to values)

Example: Let us consider the frame

   ------------------ 
   | a -> 2         | 
   | b -> #t        | 
   | f -> <..., p>  | 
   ------------------

This is the graphical representation of a partial function: the function associating the value 2 to the variable a, the value #t to the variable b, and the closure <..., p> to the variable f (recall that closures belong to the set V of values).
If we call this function (frame) rho (rho is the symbol used in the paper to generically denote a frame), we have that
rho(a) = 2
rho(b) = #t
rho(f) = <..., p>

During the evaluation of an expression, a frame can be modified (by evaluating define expression, for instance). We denote by
rho[pippo, 5]
the extension of the frame rho is extended with the new association pippo -> 5.
This notation makes explicit that rho[pippo, 5] is frame which is identical to rho, except for the new association pippo -> 5.
If we write
rho[a, 5]
we are donoting the frame rho with the value 5 associated to the variable a; in practice it is like overwriting a, because, in this particular rho we are considering as example, a is already associated to 2.

The empty frame is denoted by the emptyset symbol (for typographic reasons we shall write "ef").
The empty frame will be needed when we describe the application of a closure to an argument, since we need to denote a new frame with the associations formal parameters -> actual parameters.
To denote, for example, a new frame with the association x -> 1 we shall write
ef[x, 1]

We formalize now the concept of environment.
An environment is a set of frames linked together by means of pointers. In our formal description an environment will always have a tree structure. The formalization of the operational semantics will guarantee that we always abtain tree-shaped environments out of tree-shaped environments.
An environment is a function that, given a pointer, returns the pointed frame and the pointer to its father. The pointer to the father is needed since, when the value of a variable is not present in the current frame, we need to look for it in the father frame or possibly in the father of the father and so on....
Let Loc be the domain of pointers (locations). So an environment is an element of the following domain:

Env = [Loc => (Loc U (top) x F)] where F = [Var -> V]

Env is the set of partial functions from pointers to pairs of the form (pointer, frame). Recall that in an environment there always exists the top level frame (the fatherless one). That is why we have added the symbol "top" in the definition.

Let us consider the environment zeta containing the frame rho of the example above. Let n be pointer to the frame rho end let us assume the rho's father to be pointed at by n' (n'=top in case rho be the top level frame).
Then we have that zeta(n) = (n',rho)
If we add the association x -> 1 to rho, the whole environment is modified and we denote this new environment by the following expression:

zeta[n, (n', rho[x,1])]
This notation tells us that this environment is identical to zeta, but for the fact that now the pointer n points to the modified frame rho (containing the new association x -> 1).

Now we can finally present the Formal System of Scheme Structured Operational Semantics .

At the base of the system for expressions we have axioms of the following shape:

zeta, n |-- 3 -> 3
These axioms state that the value of the constant 3 is the number 3 in any possible environment and in any possible current frame.

zeta, n |-- (+ 2 5) -> 7
Axioms like the above one shows how predefined functions behave. There will naturally be judgments of this sort for every constant and for every predefined function. Look in the paper (at page 7) for the general form for this type of judgments.

Now, let us have a look at rules (var 1) and (var 2). As a matter of fact (var 1) not a rule, it is an axiom.
A rule, to be such, must have some judgments as assumptions and a judgment as conseguence. In (var 1) there is no premise judgment (those that are above the line are just some additional conditions).
The axiom (var 1) says:
"In an environment z, using n as current frame pointer, if n points to a frame rho containing the identifier x associated to a non indefinite value, then the value of x is rho(x)."

We could read the same axiom bottom up. In this way it represent what the interpreter does when evaluating a variable x:
"If x is evaluated in the environment zeta, using n as current frame pointer, we first have to check whether x is in the frame pointed by n. If it is so, its value is non indefinite, its value is rho(x)."

The rule (var 2) describes how the interpreter evaluates a variable x in case it is not in the pointed frame. It looks for it in the father frame. (var 2) is a real inference rule and in this case the premise judgment is "zeta, n' |-- x -> v, zeta ".
(var 2) states that:
"In zeta, using n as current frame pointer, in case x does not belong to the domain of rho (the frame pointed at by n), and rho is not the top level frame (n' is not Top), the value of x is to be looked for in rho's father; if the evaluation of x in rho's father (zeta(n')) returns v, then v is also the value of x in zeta and n."

The notion of error in our system is formalized by assuming that an error is produced by the interpreter whenever no rule is applicable, i.e. the expression we need to evaluate is in no consequence judgment of any rule (and therefore we cannot succeed in deriving any judgment for our expression).
For example, if I look for a variable x which is not present in any frame of zeta, none of the two preceding rules can be applied and therefore the interpreter returns an error.

Now we analyse the rule that formalizes the evaluation of applications. An application in Scheme is an expression of the form (e₀ e₁... e_m), where e₀ has to be an expression whose value is a closure or a predefined function. For example e₀ could be +, or (lambda (x) (- x 3)) or (lambda (x) x) +), etc, etc.
We define, then, two rules, one for each case. Study the definitions of (appl 1) and (appl 2) in the paper (page 8).

Let us consider (appl 1).
This rule has a lot of premises. As a matter of fact, some of them are real premises, but some are just conditions needed to guarantee the validity of the rule.
The function "Next" will be used in the rule. "Next" takes an environment zeta as input, and returns a a fresh pointer p not yet used in zeta.
The rule, read bottom up, states that:

In zeta₀, using n as current frame pointer, the value of an expression (e₀ e₁... e_m), is obtainined by first checking the validity of the following hypothesis:

Hp 1. In zeta and n, the value of e₀ must be a closure.

Notice that the number of arguments (x₁ x₂... x_m) of the lambda-expression has to be equal to the number of parameters e₁ e₂... e_m applied to e₀.
(d₁ d₂... d_p) are the possible local define expressions in the body of the lambda-expression.

Hp 2. The expressions e₁ e₂... e_m have to be evaluated from left to right. Let v₁ v₂... v_m be their values.
The evaluation of each e_i can modify the environment. That's why we denote by zeta' the environment before the evaluation of e_i, and by zeta_i+1 the environment after its evaluation. At the end of the evaluations of all the e_i's the environment will be zeta_m+1.
Hp 3. will be explained later on.
Hp 4. zeta_m+1 has to be modified by extending the frame pointed by n' (the pointer in the closure which is the value of e₀) with a new, temporary frame, containing the associations formal parameters -> actual parameters. We call zeta'₁ such modified environment.
A new frame is created (by modifying an emptyframe), containing the associations formal parameters -> actual parameters. Such a frame is pointed by Next(zetam+1), its father is the one pointed by n'.
Hp 5. The local definitions (d₁ d₂... d_p) have to be evaluated.
They are evaluated using Next(zeta_m+1) as current frame pointer, that is the temporary frame that we have added. The evaluation of each the local define expression d_i can modify the environment from zeta'_i to zeta'_i+1.
Hp 6. Now the body of the function can be evaluated.
It is evaluated in zeta'_p+1 (the environment we got after the evaluation of all the local define expressions), using Next(zeta_m+1) as current frame pointer. The evaluation of "e" can modify the environment from zeta'_p+1 to zeta''. Let "v" be the value of "e".

If all the hypotheses above are verified then the value of the application (e₀ e₁... e_m) in zeta, with n as current frame pointer, is "v", and the environment after the evaluation is zeta''.

Remark. At the end of the evaluation there is no reference left to the frame Next(zeta_m+1). When the garbage collector will realize that the part of memory that implements this frame is not referenced by any pointer anymore, it will add it in the heap so it can be used again.

We now discuss the Hypothesis 3.
The body of the lambda-abstraction can be preceded by some local define expressions. When the new temporary frame with the associations formal parameters -> actual parameters is created, the interpreter associates in it also the indefinite value ("? ") to the variables of the define expressions.
For example, if before the body of the lambda expression there is
(define to 3)
(define b 5)
then the frame with the associations formal parameters -> actual parameters contains also the following associations:
a -> ?
b -> ?
When the define expressions will be evaluated (see the Hypothesis 5) the indefinite value "?" will be overwritten with the real values to be associated to the variables.
In the Hypothesis 3 the set {y₁, y₂,...y_q} denotes the set of such variables.
Notice that, since there could be more than one define expression for the same variable, the number p of define expressions can be greater than the number q of variables.
This sort of "inizialization" with "?" of the define variables is denoted in the rule by [y₁ ?]... [y_q ?].
Let us explain by means of an example why such initalization is needed.
We define H as a function that takes an argument and contains two local define expressions.

 (define H 
     (lambda (arg) 
        (define S (+ c 1)) 
        (define c 3) 
        (+ (+ S c) arg)))

According to the semantics just defined, when we evaluate an application of H to an argument, say (H 5)on H, the interepreter extends the frame in which H has been defined and inserts in it the following associations:
arg -> 5
S -> ?
c -> ?
Then the define expressions are evaluated.
When the first one is evaluated, the value of (+ c 1) is associated to S. But the value of c is "? " and the semantics of "+" says that its operands cannot have indefinite values, therefore the interpreter returns error.

What would have happened if the variables would not have been initialized with the indefinite value "?" ?
The interpreter would have looked for the value of c in the current frame and, not finding it, it would have looked for it in the father frame. Then, without the "initialization" of the variables of the define expressions, we would have got a different result.
What is then the purpose of the given semantics?
It forces the scope of the define variables to be the body of the lambda expression and moreover it forces a variable to be defined before its use. If the interpreter finds that a local variable is used before its definition, it returns error (instead of going up to the father to look for a value associated to the variable).

Read by yourself the other rules of the semantics in the paper (not the ones for "begin" and "set")

NOTE: Many exercises on operational semantics ask to take a rule and to modify it in a certain way. By means of these exercises you can check whether you have undertood how to "read" the rules. Of course, by modifying one of these rules one completely changes the way the interpreter evaluates expressions.
If, for example, in the Hypothesis 1 and 4 of (appl 1) we put n instead of n', it would result in a complete different behaviour of the interpreter: Using n' the rule indicates that the frame where the function has been defined is extended and here its body is evaluated. Using n, instead, the rule would indicate that the frame to be extended is the one where the function is evaluated. In this way Scheme would be a language with dynamic scoping.

Lecture Notes on Scheme Operational Semantics

Lecture Notes on
Scheme Operational Semantics