Introduction to Scheme

A functional programming language, in a nutshell, is nothing more than the lambda-calculus + a reduction strategy (with the definition of what is a value, in case of a call-by-value strategy) + a type discipline (only in case of typed languages). All the rest is, in a sense, just syntactic sugar.

In Scheme, in fact, we have the notion of variable, application and lambda abstraction. Its reduction strategy (that we shall precisely define when studying the formal operational semantics of Scheme) is a call-by-value reduction strategy. Scheme contains predefined constants (for numbers, for boolean values, for lists etc.) and predefined functions (on numbers, like +, on boolean values, like or, on lists, like cons.) What is a value for Scheme? well, any predefined constants is a value, a list of values is a value (a variable is not! unless it has been previously associated to a value in the environment by means of a define expression), a function is a value (even if a bit improperly, this consists in saying that a lambda expression is a value; later on we shall see that this notion has to be better specified).

We used before the notion of "environment". We can think of an environment as a set of associations (variable --> value). At the beginning of a Scheme session the environment contains only the predefined associations (like that of the identifier "empty" to the value empty list). The environment can be modified ("modify"... bad word in the functional word!!) by evaluating a define-expressions. Nothing similar to a define-expression is present in the syntax of the lambda-calculus (and this is reasonable, since define-expressions are sort of associated to the notion of modification, which is a bad one in a pure, heavenly functional world...). Define-expressions are needed in real languages for efficiency and expressiveness motivations (otherwise it would be very difficult to write actual programs and we would be forced to implement recursive algorithm by means of fixed-point combinators...).

What is the use of looking at a lambda expression as a value? Well, otherwise in the definition

(define pippo (lambda (x) (+ x 3)))
if the expression (lambda (x) (+ x 3)) were not a value, we could evaluate its subexpression (+ x 3), and this makes no sense.

Another thing in which Scheme differs from the lambda-calculus is that Scheme has functions with arity greater than one and a Scheme interpreter expects you to provide to a function its right number of arguments according to the arity of the function (Haskell, instead, considers all function to be implicitely curryfied)




Lecture Notes on Recursion, using Scheme

Recursion on numbers

First of all, you are kindly invited to study, on your favorite Scheme manual or textbook, the predefined operations and predicates on numbers, the sintax of lambda-expressions, define-expressions, if-then-else and cond expressions.

Recursion is a naturally applied process to define functions on recursively defined data (called also inductively defined data or algebraic data).

Rough intuition: A rough idea of how we can look at a recursive function definition is the following one.
When we define a function by recursion we first try to look at the elements of the domain as "built" out of "simpler" parts (or you can think about it in terms of "smaller" parts). Then we try to express the value of the function on a generic element in terms of the value of the function on its "simpler" parts. We need also to provide the value of the function on the "simplest" ("smallest") elements. We must also be sure that, starting from a generic element, it is not possible to go on getting simpler and simpler (smaller and smaller) elements for ever.

The easiest inductive data are obviously the natural numbers: only one base element (zero) and a unary constructor (the successor function).

Let us define a simple recursive function on numbers, which sums up all the natural numbers from n downto 0.

(define (sumdowntozero n)
      (if (= n 0)
          0
          (+ n (sumdowntozero (- n 1)))))

You notice that the base case is 0, but it could be different, like in this other example, where we sum up all the numbers from a number "from" downto a number "to", with "to" less than or equal to "from".

(define (sumfromdownto from to)
      (if (= from to)
          to
          (+ from (sumfromdownto (- from 1) to))))

Notice that we have always in mind that a recursive function must have the recursive calls performed on "smaller elements". For numerical inputs this means "smaller numbers".

But then, what about the following function, which returns the same result as the previous function?

(define (sumfromupto from to)
      (if (= from to)
          to
          (+ from (sumfromupto (+ from 1) to))))

Well, it works. And it is still true that recursive calls are on "smaller" elements down to the base case. It is just that now the relation of "smaller" is not the usual one on natural numbers (actually is the opposite relation). We will be able to be more precise when studing recursion and induction principles (in particular well-founded sets).



Recursion on (flat) lists

Now, let us turn to another predefined inductive type in Scheme, the lists.
By a flat list we mean a list were no element in it is a list.

You are (always kindly) invited to study on your favorite Scheme manual or textbook the main Scheme operators on lists (empty, null?, cons, car, cdr, caar etc.).

Let us see the append function

 (define  append (lambda ( ls1 ls2 )
     ( if  (null? ls1)
           ls2
           ( cons (car ls1) (append (cdr ls1 ) ls2)) )))

Exercise: Program the reverse function.

The append function is an example of function defined by recursion on the lenght of the list, this informally means that the recursive calls are on shorter lists than the input one.

Let's define the function "member?". It checks whether a number is in the list of numbers given as input. (it is a Scheme convention to put a question mark as final character of names of predicates, i.e. of functions returning boolean values)

(define (member? elem ls)
    (if (null? ls)
        #f
        (if (= elem (car ls))
            #t
           (member? elem (cdr ls)))))

We can use boolean operators in order to get a simpler and clearer definition. Boolean operators are very helpful to define predicates.

Before going on you are invited to study the the boolean operators of Scheme in your favorite Scheme manual or textbook.

(define (member? elem ls)
   (and (not (null? ls)
        (or (= elem (car ls))
            (member? elem (cdr ls)))))
 

Exercise: write functions to sort a list using the different sorting algorithms you know.

Now let's consider another example of recursion on the length of a list: a function summing up all the numbers present in a list of numbers.

(define (Sumlist ls)
     (if (null? ls)
         0
         (+ (car ls) (Sumlist (cdr ls)))))

Intuition: How did we manage to program this function? Well, first we provided its result ( 0 ) in case the input be the basic element of the domain (the empty list). Then, by assuming that our program works correctly on lists that are "smaller" than the input, (Sumlist (cdr ls)), we "built" the result for the generic input ls: (+ (car ls) (Sumlist (cdr ls))).
We considered (cdr ls) to be "smaller" than ls.
But why do we have to use "smaller" inputs for the recursive calls?
In order to guarantee the computation eventually stops. In fact we cannot go on forever calling the function on smaller and smaller elements. Sooner or later we reach the emptylist, on which the function is correctly defined.
The problem is how we can be sure that it is not possible to have an infinite sequence of smaller and smaller elements.
For the present function there is no problem, since a "smaller" input consists in a shorter list, and it is impossible to have an infinite sequence of shorter and shorter lists.
But we shall see that it is not always so simple....

So, we need
(1) to provide the result for the simplest elements of the domain;
(2) to provide the result for all the other possible "shapes" of the input using the assumption that the program works correctly on "smaller" arguments than the input.;
(3) to be sure that there exists no infinite sequence of smaller and smaller elements

Point (3) above can be formally proved by using the notion of well-founded precedence relation.
Before introducing such a notion, let us look at onother example, where we use a different algorithm to sum the elements of a list of numbers.

Let us write a function that sums up the number in a list using the constraint that the first argument of an addition must always be 1 (it is a weird and inefficient way to sum up the elements of a list, but in our examples we do not care about efficiency, we simply wish to practice with recursion.)

(define (Sumlistweird ls)
     (cond ((null? ls) 0)
           ((= (car ls) 0) (Sumlistweird (cdr ls)))
           (else (+ 1 
	            (Sumlistweird (cons (- (car ls) 1) 
		                        (cdr ls))))))

Intuition: In this case we provided the value for the simplest element and for all the possible shapes of the input different from the emptylist:
(in an informal notation)
Sumlistweird emptylist = 0
Sumlistweird (cons 0 (cdr ls)) = Sumlist (cdr ls)
Sumlistweird (cons n (cdr ls)) = 1 + (Sumlist (cons (- n 1) (cdr ls)))

We assumed that Sumlistweird works correctly on "smaller" elements than ls. But now we are considering both (cdr ls) and (cons (- n 1) (cdr ls)) to be smaller than ls! Notwithstanding (cons (- n 1) (cdr ls)) has the same lenght as ls!
Actually this is not a problem, we need only to be sure that, according to this particular notion of "being smaller" there is no infinite sequence of smaller and smaller lists. But how can we be sure of that?
The notion of well-founded precedence relation will help us.

Study the Notes on well founded sets

We can now be sure that Sumlistweird is correctly defined. You see, the recursive calls are on lists that have a smaller first element or that are shorter. So the precedence relation implicitely defined by the recursive calls is a lexicographic one, on the first element and on the lenght.

So, in general, to be sure of termination, we need to consider the precedence relation induced by our program (or a larger one if it is easier to handle) and prove it well-founded.

A real formal proof of termination of programs can be obtained by means of the logical principle of well-founded induction, which is a generalization of the well-known principle of mathematical induction. Using such a principle we can prove many more properties of programs than simply termination.
Study it in Induzione e Ricorsione (ps file) only when you have practiced a lot with recursion


Recursion on nested lists

In Scheme lists are not homogeneus, so we can have anything in a list, also other lists.

One-level nesting

Let us consider a function on lists of lists of numbers.
The following function returns the sum of all the numbers present in a list of lists of numbers. (We call lsls the argument, in order to recall that it is a list of lists (of numbers)).

(define (Sumlistlist lsls)
    (if (null? lsls)
        0
        (if (null? (car lsls))
            (Sumlistlist (cdr lsls))
            (+ (caar lsls) 
	       (Sumlistlist (cons (cdar lsls) 
	                           (cdr lsls)))))))

Useless to say that the above function could be defined far more easily by means of the functions i Sumlist and map (do it as exercise).
But now we are not interested in the best ways to make sums, we are just playing with recursion in its various forms.

Now we write a "weird" version of Sumlistlist: a version where "+" can be used only with 1 as first argument.

(define (Sumlslsweird ls)
   (cond ((null? ls) 0)
         ((null? (car ls)) (Sumlslsweird (cdr ls)))
         ((= (caar ls) 0) (Sumlslsweird (cons (cdar ls) 
	                                      (cdr ls))))
         ( else (+ 1 
	        (Sumlslsweird (cons (cons (- (caar ls) 1) 
		                          (cdar ls)) 
                                    (cdr ls)))))))

This is an example of triple induction: on the length of the list, on the length of the first element and on the numerical value of the first number of the first element of the list. In other words, the induced precedence relation is a lexicographic one, on triples.


Multiple levels of nesting

We consider now lists containing numerical elements or other lists at any level of nesting.

Let us define the function (actually a predicate) genmember?.
Such a function checks whether an element (number or symbol) is in a list containing symbols or numbers at any level of nesting (corrected by C. Pistagna).

Study in your manual the eqv? predicate.

(define (genmember? elem ls)
    (if (null? ls)
         #f
        (if (list? (car ls))
            (or (genmember? elem (car ls))
                (genmember? elem (cdr ls)) )
            (or (eqv? elem (car ls))
                (genmember? elem (cdr ls))))))

We write now a function that sums all the numbers present in the input (a list containing numerical elements or other lists at any level of nesting.)

(define (Gensumls llss)
    (if (null? llss)
        0
        (if (list? (car llss))
            (+ (Gensumls  (car llss))
               (Gensumls  (cdr llss)) )
            (+ (car llss) (Gensumls  (cdr llss))))))
 

Notice that the recursion is not on the level of nesting. In fact (cdr llss) could have the same level of nesting of llss.



Because of the absence of types in SCHEME, we can do ourselves the check that the language does not make.

Let's define a predicate checking whether an argument is really a list of lists of numbers, by first defining a predicate checking whether an argument is a list of number.

(define (lis-num? x)
  (and (list? x)
       (or (null? x)
           (and (number? (car x))
           (lis-num? (cdr x))))))

(define (lis-lis-num? x)
  (and (list? x)
       (or (null? x)
       (and (lis-num? (car x))
            (lis-lis-num? (cdr x))))))

If we wish to define lis-lis-num? directly, without using lis-num?, we have to do as follows, using a more complex recursion, inducing a lexicographic precedence relation. By the way, what is precisely the precedence relation? Mind that the input domain is the domain of all possible values.

 (define (lis-lis-num2? x)
    (or (null? x)
        (and  (list? x)
              (list? (car x))
              (or (and (null? (car x))
                       (lis-lis-num2? (cdr x)))
                  (and (not (null? (car x)))
                       (number? (caar x))
                       (lis-lis-num2? 
		            (cons (cdar x) 
			          (cdr x))))))))



If somebody knows the game of the Hanoi towers, the program to solve it in the right way is the following.

(define (tower-of-hanoi n peg1 peg2 peg3)
   (if (= n 0)
       '()
       (append (tower-of-hanoi (- n 1) 
                               peg1 
			       peg3 
			       peg2)
               (cons (list peg1 peg3) 
                     (tower-of-hanoi (- n 1) 
		                     peg2 
				     peg1 
				     peg3)))))

In this definition of the tower the problem is defined by giving the number of disks and the name of the three pegs (that can be also be identified by a number); it is assumed that the first one is where all the disks are at the beginning. The third one is where the disks have to be moved to. In the above definition, if we put '(peg1 peg3) instead of (list peg1 peg3) we would get a wrong result. Why?.

Of course we can have recursion on all sort of inductively defined Data Types. Before looking at recursive programs on such types you are invited to study the Lecture Notes on Data Types..



Recursion on Trees

Let us define a few functions on binary trees with numerical labels.

The following function computes the height of a tree

(define (max n m)
    (if (> n m) n m))

(define (height tree)
         (if (emptyNBTree? tree)
             0
             (+1 (max (height (leftNBTchild tree))
	              (height (rightNBTchild tree))))))

Here is the Scheme function that produces the list of the labels of a tree using an inorder visit.

(define (inorder tree) 
    (if (emptyNBTree? tree) 
        '() 
        (append (append (inorder (leftNBTchild tree))
                        (list (labelNBT tree)))
                (inorder (rightNBTchild tree)))))

We notice that our notion of "being smaller" in this case is that of having smaller height. The recursion is on the height of the tree.

We define now three different versions of the function that returns the sum of all the labels present in a NBTree.
The second and third functions use algorithms that are not very natural. We define them as examples of complex sorts of recursion that are actually used sometimes.
The simplest version, which uses recursion on the height of the tree is the following.

define (Sumtree tree)
    (if (emptyNBTree? tree)
        0
        (+ (labelNBT tree) 
	   (Sumtree (rightNBTchild tree)) 
	   (Sumtree (leftNBTchild tree)))))

The next version works on the label of the bottom leftmost leaf. First we define some auxiliary functions.

(define (minusOneleftmost tree)   
        ;; returns the tree with the label 
        ;;of its bottom leftmost leaf
        ;; decreased by 1 or without such a leaf 
        ;; if the label is equal to 1
  (cond ((emptyNBTree? tree)  emptyNBTree)   
        ((isleaf? tree) 
	      (if (= 1 (labelNBT tree))
                  emptyNBTree
                  (mkNBTree (- (labelNBT tree) 1) 
	                    emptyNBTree 
                            emptyNBTree)))
         ((emptyNBTree? (leftNBTchild tree)) 
	      (mkNBTree (labelNBT tree) 
		           emptyNBTree 
			   (minusOneleftmost 
			         (rightNBTchild tree))))
         ( else (mkNBTree (labelNBT tree) 
	                  (minusOneleftmost 
			         (leftNBTchild tree)) 
			  (rightNBTchild tree)))))

(define (Sumtreeweird tree)
  (if (emptyNBTree? tree)
      0
      (+ 1 (Sumtreeweird (minusOneleftmost tree)))))

On what is the recursion?


Just to stress how much more elegant is Haskell, let us give the Haskell version of the last function, heavily using patter matching (it is possible to write even a more elegant version with a less heavy use of pattern matching).

minusOneleftmost  EmptyNBTree  =
          EmptyNBTree 

minusOneleftmost (MkNBTree 1 EmptyNBTree EmptyNBTree) =
          EmptyNBTree

minusOneleftmost (MkNBTree n EmptyNBTree EmptyNBTree) =
         (MkNBTree n-1 EmptyNBTree EmptyNBTree)

minusOneleftmost (MkNBTree n EmptyNBTree t2)          = 
         (MkNBTree n EmptyNBTree (minusOneleftmost t2))

minusOneleftmost (MkNBTree n t1 t2)                   =
         (MkNBTree n (minusOneleftmost t1) t2)


Sumtreeweird EmptyNBTree = 0
Sumtreeweird t = 1+(minusOneleftmost t)


Let us define in Scheme the function Scheme that takes a stack of numbers and "delete" all the elements down to the uppermost 0.

(define (f stk)
    (cond ((emptyNStack? stk)  stk)
          ((=(topNS stk) 0) stk)
          (else (f (popNS stk)))))
 

In Haskell we would have

f  EmptyNStack   =  EmptyNStack 
f (PushNS 0 stk) = stk 
f (PushNS n stk) = f stk 

We consider the data type of lambda-terms in Haskell and write the equality predicate between variables.

eqvars (MkVar n) (MkVar m) = (n == m)

Let us define the Haskell function that returns the set of the bound variables of a term (we use the parametric data type Set)

boundVars (MkVat v)  =  Emptyset 

boundVars (Appl lt1 lt2)   =
     union (boundVars lt1) (boundVars lt2) 

boundVars (LambdaAbs v lt) =
     insert v (boundVars lt)  

Of course we need first to define the set union function.
Exercise: Do it, together with the function belongsTo.

Exercise: define the function that returns the set of the free variables of a lambda term.



Recursion on Sarchiapones

Well, now you have understood that indeed the mechanism for programming functions over recursively defined (inductive) data types is always more or less the same. We can easily define functions on sarchiapones (see the definition of the Sarchiapone data type in the Notes about inductive data types)

Let us write in Scheme and in Haskell the function porompoppo that takes in input a sarchiapone S and and two natural numbers n and m. porompoppo returns a sarchiapone similar to S but in which all the occurrences of n are replaced by m.

(define (replace ls n m)
    (if (null? ls)
        '()
        (cons (if (= (car ls) n)
	          m 
		  (car ls))
              (replace (cdr ls) n m))))


(define (porompoppo sarch n m)
   (cond ((or (manzanillo? sarch) (minollo? sarch))      
	    sarch)
         ((tirulero? sarch) 
	    (tirulero (if (= (takenumtirulero sarch) n)
                          m
                          (takenumtirulero sarch))
                      (porompoppo 
		            (takesarchtirulero sarch)
			    n 
			    m)))
         (else (zumpappero 
	           (porompoppo 
		         (takefstsarchzump sarch) 
			 n 
			 m)
                   (porompoppo 
		         (takesndsarchzump sarch) 
			 n 
			 m)
                   (replace 
		         (takelistsarchzump sarch) 
			 n 
			 m)))))

Now in Haskell

replace [] n m     = []
replace (x:xs) n m = 
    (if (x == n) then m else x) : (replace xs n m)


porompoppo :: 
  Sarchiapone -> Integer -> Integer -> Sarchiapone

porompoppo Minollo n m = Minollo
porompoppo Manzanillo n m = Manzanillo
porompoppo (Tirulero k sarch) n m =
     Tirulero 
         (if (k == n) then m else k)
         (porompoppo sarch n m)
porompoppo (Zumpappero sarch1 sarch2 ls) n m =
     Zumpappero 
         (porompoppo sarch1 n m)
         (porompoppo sarch2 n m)
	 (replace ls n m)