Notes OO and OCaml

Lecture Notes on type systems for OO languages and OCaml

(courtesy of D. Contarino)

In these notes we shall take into account some general (and somewhat) problematic aspects of type systems for class-based O.O. languages and how they are dealt with in the language OCaml, a functional language derived from Caml and ML with imperative and object-oriented features. We shall have a look here mainly at the OO features of OCaml.

Before starting, in order to prevent anyone thinking that OCaml is a language of no real use, please have a look at the site of the firm Jane Street and in particular to the link "Technology".

Advantages of statically typed languages

We first recall some of the many advantages of using statically type-checked languages ("statically type-checked" has the same meaning as "strongly typed"):

providing earlier, and usually more accurate, information on programmer errors;
eliminating the need for run-time type checks that can slow program execution and increase program size;
providing documentation on the interfaces of components (e.g., procedures, functions, and packages or modules);
providing extra information that can be used in compiler optimizations.

(K.Bruce)

As we shall realize in the following, many type problems and rigidities present in statically typed object-oriented languages mainly depend on the

conflation of type with class;
mismatch of the inheritance hierarchy with subtyping

Notice that both of these causes of problems are present in Java.

We recall that Java is a language with nominal subtyping (see [FB]). OCaml uses instead structural subtyping. In such a language, as in many other O.O.languages, the types of objects are similar to the record types defined in [FB].

In Ocaml we can define single objects (as in object-based languages), or classes (as in class-based languages). However, besides the possibility of defining single objects, OCaml does not have all the other features of object-based languages. It has instead all the features of a class-based language.
We shall not deal at all in this course with object-based languages (also called prototype-based languages). These ones are object-oriented languages that do not have classes; objects instead inherit their code and data directly from other "template" objects. An example of a commonly used prototype-based language is JavaScript.

The following is the definition of a single object in Ocaml representing a two-dimensional point.

let myPt =
     object
       val mutable x = 0           
       val mutable y = 0
       method get_x = x
       method get_y = y
       method setcoord new_x new_y = x <- new_x ; y<- new_y
     end;;

In order to have the interpreter evaluate an expression or a definition we have to write a double semi-colon (;;) after it.
When the interpreter evaluates such a definition it returns

val myPt : < get_x : int; get_y : int; setcoord : int -> int -> unit > =
  < obj >

so telling you that you have just correctly defined a value called myPt; the type of such a value is a record type, in particular

< get_x : int; get_y : int; setcoord : int -> int -> unit >

The value named myPt is < obj > (specific object values are not explicitely shown).
Notice that the type of objects in OCaml are record types containing only the types of the methods (so a record type can be looked at as an interface). Object types do not mention instance variable since they are not publicly accessible.
The two coordinates of myPt are defined as mutable since they are variables in the very sense of imperative programming.
The method setcoord modifies such variables. In fact

x<- new_x

is an assignment in the imperative sense.
An assignment for OCaml is an expression and like any expression in functional programming, it has a value. By convention in OCaml the type of an assignment (or a sequence of assignments) is unit.
The value returned by evaluating an assignment is (), which is the only element of the type unit.

The approach to types of Ocaml is a' la Curry. Like in Haskell types are inferred.
The type system of the functional part of OCaml is similar to that of Haskell, with the same sort of polymorphism (type variables are called in OCaml 'a,'b, etc).
Similarly to Haskell, when ocaml is started we can interact with the interpreter, giving to it expressions to evaluate, or definitions as the one above.

Let us define a function that move a bit a point along the diagonal:

let bumpd p = p#setcoord (p#get_x +1) (p#get_y +1);;             (*)

A method in OCaml is invoked by using '#'.
The type of bumpd inferred by the type system is the most general one, namely

val bumpd :
  < get_x : int; get_y : int; setcoord : int -> int -> 'a; .. > -> 'a 
 = < fun >

An input for bumpd can then be any any record containing at least (this is the sense of the dots ".." used by OCaml in the record type above) three methods: get_x, get_y and setcoord, with the specified types.
In fact in OCaml an expression of the form < m:T; ..> denotes the type of any object containing at least the method m.
So, bumpd can be applied to myPt.

If we wished bumpd to work only on elements having type
< get_x : int; get_y : int; setcoord : int -> int -> unit >
we should impose a restriction on the type of its argument, by writing, instead of the previous definition, the following one

let bumpd (p:< get_x : int; get_y : int; setcoord : int -> int -> unit >)      (#)
            = p#setcoord (p#get_x +1) (p#get_y +1);;

The type of bumpd is now

val bumpd :
  < get_x : int; get_y : int; setcoord : int -> int -> unit > -> unit = < fun >

Self

All O.O. languages enable us, as Java for instance, to refer to the object itself.
In OCaml, we have to declare which name we wish to use for the self. This is obtained by writing (self) after the keyword object. Usually the name self is used, but any other one will work the same.
Let us use the self in the following definition of a point that has a method enabling it to be moved.

let myMovablePt =
  object (self)
    val mutable x = 0           
    val mutable y = 0
    method get_x = x
    method get_y = y
    method setcoord new_x new_y = x <- new_x ; y<- new_y
    method move inc_x inc_y = self#setcoord (x + inc_x) (y + inc_y)
  end;;

Classes

In all class-based O.O. languages the notion of class is obviously a central one.

Classes: extensible templates for creating objects, providing initial values for instance variables and the bodies for methods.

All objects generated from the same class share the same methods, but contain separate copies of the instance variables.

Here is the definition in OCaml of a class point whose state is made of two integer coordinates initialized with 0.

class point =                                  
    object 
      val mutable x = 0
      val mutable y = 0                                         (**)    
      method get_x = x
      method get_y = y
      method setcoord new_x new_y = x <- new_x ; y<- new_y
    end;;

The interpreter, after the evaluation of such a class definition returns

class point :
  object
    val mutable x : int
    val mutable y : int
    method get_x : int
    method get_y : int
    method setcoord : int -> int -> unit
  end

so telling us that the class point is correctly defined and describing which are the types of the instance variables and methods of its objects.
As natural in O.O. languages, instance variables are in general not accessible from outside of the object's methods.
In OCaml, like many other O.O. languages, it is possible to modify the visibility scope of attributes and methods, if needed. We shall not deal with such a topic here.

The type of objects of class point is the very same of that of the object myPt we defined before. point is just the name we give to that record type.

In OCaml, like in many O.O. languages: a class name is a

name for the class
name for constructor
name for the type of the objects

Some authors, however, suggest that it is better not to conflate these uses (K.Bruce). Java, by using nominal typing, goes even further: the name of a class is the type of its object (a rather problematic issue).

Going on with our OCaml class example, we can now create a new element of class point as follows.

let mypoint = new point;;

The elements of class point have type point, but in OCaml, which has structural typing, the name point is simply a shorthand for the record type containing the types of the methods, that is

< get_x : int; get_y : int; setcoord : int -> int -> unit >

This means that myPt, the single point we created before, and mypoint, an element of the class point, have exactly the same type.
So we can apply bumpd to myPt or mypoint without problems:

bumpd myPt;;

bumpd mypoint;;

As a matter of fact, the function bumpd could have been defined equivalently as

let bumpd (p:point) = p#setcoord (p#get_x +1) (p#get_y +1);;

Notice that also the following strange point

let myStrangePt = object 
        val mutable x = 0
        val mutable y = 0
        method get_x = 3
        method get_y = 4
        method setcoord new_x new_y = x <- y ; y<- new_y * new_x
        end;;

has the very same type of myPt, and mypoint, that is
< get_x : int; get_y : int; setcoord : int -> int -> unit >
Hence it is type-safe to use it as argument of bumpd

bumpd myStrangePt;;

In OCaml we can define a class in such a way we can initialize the instance variable by providing arguments to the constructor of the class.
This means that, instead of defining the following class of cells

class cell =
  object
  val mutable contents = 0
  method read = contents
  method write new_cont = contents <- new_cont
  end;;

we could define

class cell (init_cont:int) =
  object
  val mutable contents = init_cont
  method read = contents
  method write new_cont = contents <- new_cont
  end;;

By having this last definition evaluated we get

class cell :
  int ->
  object
    val mutable contents : int
    method read : int
    method write : int -> unit
  end

so we can create a cell c containing the number 5 by writing

let c = new cell 5;;

Notice that if we do not constrain init_cont to be of type int, we get an error.
That sounds strange, since we have parametric polymorphism, like in Haskell, and hence one would expect the following definition

class cell init_cont =                        
  object
  val mutable contents = init_cont                                (***)
  method read = contents
  method write new_cont = contents <- new_cont
  end;;

to be typable in the most general way as

class cell :
           'a ->
           object
             val mutable contents : 'a
             method read : 'a
             method write : 'a -> unit
           end

and indeed the interpreter of OCaml infers exactly such a type for the above constructor of cell class.
The problem is that this is not a correct type. In fact the error message returned by the system is

The method read has type 'a where 'a is unbound.

Without getting now into a precise definition of unbound type variable we can easily understand where the error lies:
In case a definition like (***) were allowed, we could write

let cellA = new cell 5;;

let cellB = new cell "marameo";;

The type of both cellA and cellB would be cell, but the method read in cellA would have type int, whereas for cellB it would have type string, which is unreasonable.
We would have the same problem in Haskell if we were allowed to write

data BinTree = EmptyBT | Join (BinTree a) (BinTree a)

In OCaml it is possible to define polymorphic classes, but not as above.
The polymorphic (and correct) version of cell with initializing argument is in fact

class ['a] cell (init_cont:'a) =                   
  object
  val mutable contents = init_cont                               (****)
  method read = contents
  method write new_cont = contents <- new_cont
  end;;

By evaluating this definition we get

class ['a] cell :
  'a ->
  object
    val mutable contents : 'a
    method read : 'a
    method write : 'a -> unit
  end

You see, the above definition resembles the correct polymorphic type definition in Haskell

data BinTree a = EmptyBT | Join (BinTree a) (BinTree a)

where we have to specify the type parameter of the polymorphic type, as we do in OCaml by writing class ['a] cell. Similarly to the above parametric BinTree in Haskell, the definition (****) does not specifies a class, but rather a set of classes.

In fact if we write

let cellC = new cell 5;;

we get

val cellC : int cell = < obj >

whereas by writing

let cellC = new cell "marameo";;

we get

val cellC : string cell = < obj >

where int cell and string cell are two different types, as in Haskell BinTree Int and BinTree String are different types.
In fact in Haskell BinTre is a type constructor, and in OCaml cell can be looked at as a "class" constructors. This is why we need to specify "the argument" 'a and why, by omitting it, we get the error unbound type variable.

In OCaml, like in Haskell, we have implicit curryfication, and it applies not only to normal functions, but also to class constructors and methods.
For instance, we could define

class cls x y =
 object
 val mutable attr = x+y
 method m arg1 arg2 = arg1 * arg2
 end;;

obtaining

class cls :
  int ->
  int -> object 
           val mutable attr : int 
           method m : int -> int -> int 
         end

Notice that here we do not need to specify the type of x and y, since the intepreter can infer they are of type int by the fact that the operation '+' is applied to them.

Now we can define

let obA = new cls 1 2;;

obtaining

val obA : cls = < obj >

but we could also define

let obB = new cls 1;;

obtaining not an object, but a function

val oB : int -> cls = < fun >

We could also write

oA#m 3;;

obtaining

- : int -> int = <fun>

Subtyping

Subtyping polymorphism is a relevant notion that enforces the expressing power of O.O. languages.

The notion of subtyping in Ocaml for object types is the one that we have studied for record types in [FB].
This means that if we define the following object

let mycPt =
     object
       val mutable x = 0           
       val mutable y = 0
       val mutable color = "red"
       method get_x = x
       method get_y = y
       method get_color = color
       method setcoord new_x new_y = x <- new_x ; y<- new_y
     end;;

its type is

< get_color : string; 
  get_x : int; 
  get_y : int;
  setcoord : int -> int -> unit >

which is a subtype of

< get_x : int; 
  get_y : int; 
  setcoord : int -> int -> unit >

This means that we could give mycPt as argument to the function bumpd since the meaning of the subtyping relation is that if S <: T then in any place an expression of type T is expected, an expression of type S can be used in its stead.
In Ocaml, however, subtyping is not implicit, but explicit. This means that if we wish to replace an expression exp of type T with one exp' of type S, we have to coerce exp' to be of type T by writing exp':>T (it is like an uppercast in Java; notice that downcasts are not type safe and then are not allowed in OCaml). So, in order to apply the version (#) of bumpd to mycPt we have to write

bumpd (mycPt :> < get_x : int; 
                  get_y : int; 
                  setcoord : int -> int -> unit >);;

or, equivalently,

bumpd (mycPt:>point);;

Of course, had bumpd been defined like in (*), then both

bumpd myPt

and

bumpd mycPt

would have been correct, since in that case the type itself of bumpd expresses the possibility of having implicit (width) subtyping for its argument.

Dynamic Method Invocation

Another relevant feature of O.O. languages is the mechanism of dynamic method invocation.
It is a mechanism that enhances the flexibility of the language. It can summarized as follows: the object receiving a message is responsible for knowing which method body to execute.

Dynamic Method Invocation allows a program to send messages to an object of unknown origin as long as the object has a type that guarantees it has a method with the appropriate signature. Thus objects generated by different classes may be used interchangeably and simultaneously as long as they have the same object type. (K.Bruce)

(Actually such a mechanism is often referred to as Dynamic Dispatch, being the name "dynamic method invocation" used sometimes to denote the possibility of choosing at execution-time which is the method to be executed, not only which implementation of a method has to be executed)

According to such a mechanism, if we define the following function

let f (p : < get_x : int; 
             get_y : int; 
             setcoord : int -> int -> unit >) 
          = p#setcoord 3 2;;

it is then clear that in the evaluation of the following expression the correct body for the setcoord method will be used, whatever the value of the {bool-exp} will be.

f (if {bool-exp} then mypoint else myStrangePt);;

Dynamic method invocation plays an important role in the inheritance mechanism.

Subclasses

As in most class-based O.O. languages, OCaml possesses the notions of subclass, inheritance and code reuse.

A subclass is built incrementally out of a superclass and inherits instance variables and methods, and (possibly) modifies methods.

Let us see an example of subclass in OCaml.

If we consider the class point as defined in (**), the following is the subclass of colored points.

class colorPoint =
    object (self)
      inherit point 
      val mutable color = "white"
      method get_color = color
      method set new_x new_y new_color =
                              self#setcoord new_x new_y;
                              color<- new_color
    end;;

Here we inherit all fields and methods of point.

The extreme flexibility of OCaml makes it possible to define a subclass colorPoint for the class point even in case point were defined with initialization arguments, namely as

class point (init_x:int) (init_y:int) =                                     
    object 
      val mutable x = init_x
      val mutable y = init_y
      method get_x = x
      method get_y = y
      method setcoord new_x new_y = x <- new_x ; y<- new_y
    end;;

In such a case the subclass colorPoint could be defined as

class colorPoint init_x init_y (init_col:string) =
    object (self)
      inherit point init_x init_y 
      val mutable color = init_col
      method get_color = color
      method set new_x new_y new_color =
                              self#setcoord new_x new_y;
                              color<- new_color
    end;;

and its type description would be

class colorPoint :
  int ->
  int ->
  string ->
  object
    val mutable color : string
    val mutable x : int
    val mutable y : int
    method get_color : string
    method get_x : int
    method get_y : int
    method set : int -> int -> string -> unit
    method setcoord : int -> int -> unit
  end

Notice that we need to specify only the type of the color initializer, being the types of the other initializers specified in the superclass.

Method Overriding

In the above examples we do not have taken into account the mechanism of method overriding.

Method overriding enables to adapt a method to the requirements of a subclass, enhancing the flexibility of a language and exploiting the notion of code reuse.

Let us consider again the examples of classes point and colorPoint.

In this case we define the class point with the method for changing the spatial coordinates called set.
We override this method in the subclass of colored points. The overridden method in the colorPoint class, besides setting the spatial coordinates, sets the color field to "white"

class point =                                    
    object 
      val mutable x = 0
      val mutable y = 0
      method get_x = x
      method get_y = y
      method set new_x new_y = x <- new_x ; y<- new_y
    end;;

class colorPoint =                                    
    object 
      inherit point as super
      val mutable color = "white"
      method get_col = color
      method set new_x new_y = super#set new_x new_y; color<-"white"
    end;;

The keywords as super in OCaml mean that we can invoke the point version of the overridden method by using the name super (we could use any other word instead of super, as for the name self)

Notice how, in the overridden method set we have reuse of code. In fact part of the code of set is the code of the set method of the superclass.

The possibility of overriding in a subclass a method defined in one of its superclasses definitely enhance the flexibility of O.O. languages.

In overridden methods, unlike it has happened in our example above, it is not unfrequent the necessity of changing the type of its arguments and of the result.

------------
In the above definition of the set method in the colorPoint class it would seem reasonable to have also a third color argument. This, however, is not accepted by the interpreter of OCaml (and by most of the O.O. languages), since it would not be type-safe, in general.

In some languages the method definition for set with three arguments would be accepted, but it would be treated as a completely different function with the same name, i.e. as an overloaded function. We shall take into account later on the notion of overloading for O.O. languages.
------------

Now, we have a question left to be answered in general for O.O. languages:

Which sort of changes in the type of a redefined method are type-safe ?

The previous example shows that in a subclass it is extremely reasonable to have an overridden method with a type different from the type of the same method of the superclass. The change in the example, however, is not type safe. So, what are the allowed changes in order type correctness be preserved?

Notice that we wish to answer the above question in its broadest generality, i.e. without imposing any other requirement, like, for instance, that a subclass defines a subtype.

( To those people acquainted mostly with Java as O.O. language, it sounds strange even to think of a subclass not automatically defining a subtype. The identification of the notions of subclass and subtype, however, is not obvious at all; even more, it is in general unreasonable. This can be intuitively explained by considering that

Subtyping is a matter of types, while
Subclasses (and hence Inheritance) of implementations. )

Let us consider a simple example (in a OCaml-like sintax) by K.Bruce.
When discussing general issues of O.O. languages we use a sintax similar to OCaml in order not to have to introduce another sintax for a generic language.

Let us define a class aClass

class aClass =
   object 
       ...
   method m(s:S) = {mbody}     -where {mbody} is  of type T -
   method n(anS:S) = ... self#m anS ...  
       ...
   end

and a subclass of its

class aSubclass =
   object 
   inherits aClass as super 
   ...
   method m(s:S�) = {new mbody}     -where {new mbody} is of type T' -  
   ...
   end

We have overidden m of type S → T with a method of type S' → T'.
For which types S� and T' such an overriding is type safe?
If we define

let asubObj = new aSubclass;;

and evaluate

asubObj#n arg

the body of n is evaluated, where m is used in a context in which it is expected to have type S → T. In fact the body of n provide arguments of type S to m and expects from m a result of type T.
For objects of aSubclass, instead, m is used with type S' → T' since it has been overridden.
So, the only way to be sure the occurrence of self#m in the body of n be still compatible with the rest of such a body is to require that S' → T' <: S → T (meaning that we can safely substitute an expression of type S' → T' for an expression of type S → T.) This implies the requirement, by definition of (type safe) subtyping for arrow types (see the section on subtyping in [FB]), that S <: S' and T' <: T.
So, in general, it is type-safe to override a method of type T₁ with one of type T₂ whenever T₂<:T₁. This means that, for methods that have an arrow type, it is type-safe to change contravariantly the type of the inputs and covariantly the type of the result (moreover, the number of arguments of the method cannot be changed.)

So, in order to retain type safeness when defining subclasses, we should change the type of methods precisely as in a subtyping relation!

Were we wrong then when we said that the notions of subclass and subtyping cannot coincide in general?
Unfortunately not.

Consider the following quite natural example.
Let us define a class aClass2

class aClass2 =                            (@)
   object 
       ...
   method m(s:S) = {mbody}    
       ...
   end

and a subclass of its

class aSubclass2 =
   object 
   inherits aClass2 as super 
   ...
   method m(s:S�) = ... super#m s ...     
   ...
   end

It is easy to check that, since the method m in aClass2 takes arguments of type S, in order the following expression to be type safe

super#m s

s should be of type S or a subtype of its, that is S'<:S.

We then need the very opposite requirement that we showed to be needed before (S<:S'). Hence, also in case the ruturn types were the same (say T), we would get S → T <: S' → T, which is the very opposite requirement needed in order to have the type of the objects of our subclass to be a subtype of the type of the objects of the superclass.

The only possibility to get out of the problem is then to require S=S' when defining an overridden method (a similar natural example can be shown for which we need to require T=T').

OCaml type system imposes in fact that the type of a overridden method has to be the same as the one in the superclass.

However, we shall see that, in general, even doing that does not take us out of troubles, so providing evidence to the fact that

Type checking object-oriented languages is difficult.

Exercise: find two concrete examples producing the same problems as the schematic examples above.

Change of types of instance variables in subclasses.

(The following argument is a variant of one used by K.Bruce)
One could wonder whether it is type-safe to change the type of instance variables in a subclass (mutable variables in OCaml).
The answer, unfortunately, is no.
The explanation for such an answer is easy by simply considering that an instance variable can be used both in left-hand sides and right-hand sides of assignments. Besides, when a variable is in the right-hand side, we are looking at it as a value (the value contained in it), whereas, when it is on the left-hand side, we are looking at it as a location (a reference). So, when we say that an istance variable has type T we are actually saying that it has both type T and type Ref T (the type of locations containing values of type T.)
Let us see how the subtype relation behaves with respect to reference types.
In order to extend the definition of subtyping to reference types, we should add the following rule to those described in [FB].

T' <: T
---------------------
Ref T <: Ref T'

This means that the subtyping relation behaves contravariantly w.r.t. the Ref type constructor.
An intuitive explanation for that is the following: we can roughly interpret the subtype relation as a subset relation and a location (a reference) as a box containing elements.
So, we definitely have beetles <: animals , since in any context where an animal is expected, we can put a beetle. However, if we have a context where a cage for any animal is used, we cannot replace it with a cage for beetles! in fact in that context one could put an elephant (which is an animal) in the cage, and the elephant cannot be put in the cage if we replace that cage with a cage for beetles. This means that Ref animals <: Ref beetles .

By knowing that, let us define in OCaml-like sintax a class aClass3 and a subclass of its where the type of instance variables is changed.

class aClass =
   object 
   val mutable x:T
   val mutable y:T
       ...
   method m (arg:T)= ... x<-arg; y<-x; ... 
       ...
   end

class aSubclass3 =
   object 
   inherits aClass as super 
   val mutable x:T'
   ... 
   ...
   end

If we define

let subO = new aSubclass3;;

and evaluate the following expression, where v is a value of type T,

subO#m v

we have that the instance variables x of the subobject is used in the body of m both in the left-hand side and the right-hand side of an assignment. Hence in order the method invocation to be type-safe we need to have both T' <: T and Ref T' <: Ref T . Since Ref T' <: Ref T holds only in case T <: T' , we need to have both T' <: T and T <: T', that is T = T'.

So, no change in the type of instance variable is possible in a subclass if we wish to have a type-safe system.

Fixing conflicting requirements in O.O. type systems.

The examples we gave previously provided us with some hints about the impossibility of having all of the following:

a type-safe system;
a general and flexible notion of method overriding in subclasses;
subclasses generating subtypes.

Of course we cannot give up point (1) (as actually Java does...)
We showed that (1) is in contrast with (2). Moreover, by the example (@) of the aClass2 and the aSubClass2 we also showed that (2) is in contrast with (3).
So, it seems that in O.O. languages we need necessarily to restrict the requirements (2) and (3), trying not to lose too much flexibility and expressive power of the language.

It is easy to check that by restricting (2) by imposing method overriding to be invariant (no change of the type of an overridden method) we get (3) for free.
Moreover, the examples above seemed to hint that (1) is not in contrast with invariant method overriding.
Good! In fact in Java (version 1.4 and previous ones) we have invariant method overriding and subclasses that generate subtypes. Fantastic!
Not at all: in Java, and in general, we get that at the cost of breaking the type safeness of the system. Actually the following example of binary methods shows us that (2), even with the restriction to invariant method overriding, is in contrast with (1) (an example of this sort of problem is also shown in [FB6])

A binary method is a method whose argument has the same type of the object on which it is invoked. A classical example is an equality method.

Let us define, in a OCaml-like sintax, a class of points having an equality method

class ptEq =
    object (self)
      val mutable x = 0
      val mutable y = 0
      method get_x = x
      method get_y = y
      method setcoord new_x new_y = x <- new_x ; y <- new_y
      method equal (p2:ptEqType) = 
             self#get_x=p2#get_x && self#get_y=p2#get_y
    end

Notice how, in order to deal with binary methods in a precise way when we have structural typing, we need to deal necessarily with recursive types.
ptEqType must be in fact the name of a type such that

ptEqType = < equal : ptEqType -> bool; 
             get_x : int;  
             get_y : int;
             setcoord : int -> int -> unit >

Let us define now (always in a OCaml-like sintax) a subclass of colored points, where equal is overridden in an invariant way.

class colPtEq =
    object 
      inherit ptEq 
      val mutable color = "white"
      method get_color = color
      method setcolor new_color = color<- new_color
      method set new_x new_y new_color =
                              super#setcoord new_x new_y;
                              self#setcolor new_color
      method equal (p2:PtEqType) = 
             super#equal p2 && self#get_color = p2#get_color
    end

Immediately we realize that the expression

p2#get_color

is not type-safe, since p2 is declared to be of type PtEqType, but objects of such a type do not have the get_color method!

The problem is overcome in languages like Java by recurring to casts, in particular to those that we called downcasts in FJ.
In OCaml-like sintax such a solution would correspond to writing

class colPtEq =
    object 
      inherit ptEq 
      val mutable color = "white"
      method get_color = color
      method setcolor new_color = color<- new_color
      method set new_x new_y new_color =
                              super#setcoord new_x new_y;
                              self#setcolor new_color
      method equal (p2:ptEqType) = 
             super#equal p2 && self#get_color = (p2:>colPtEqType)#get_color
    end

But we learned from FJ that downcasts are type-UNsafe!
In fact, since the type system enables us to apply the equal method of the subclass to an element of PtEq, we can define

let mypt = new ptEq

let mycpt = new colPtEq

and then write the expression

mycpt#equal mypt

whose evaluation would raise a method-not-understood exception.

So, (1) is in contrast with invariant method overriding, in general.

In Java the problem of defining binary methods notwithstanding its invariant type system is solved by means of generics and F-bounded polymorphism (a powerful and complex typing mechanisms that should be used for other things rather than definitions of binary methods...)

Let us see how OCaml (and other languages) overcome the problem of binary methods in presence of an invariant method overriding, without losing type-safeness.
In OCaml we can refer to the type of self by writing (self:'self) after the keyword object. By doing that, we can use the type variable 'self as the type of self. (As for the names self and super, we could use any other name for the type variable 'self.)

So, in real OCaml the above definition for the ptEq class is actually

class ptEq =
    object (self:'self)
      val mutable x = 0
      val mutable y = 0
      method get_x = x
      method get_y = y
      method setcoord new_x new_y = x <- new_x ; y<- new_y
      method equal (p2:'self) = self#get_x=p2#get_x && self#get_y=p2#get_y
    end;;

The type of its objects is a recursive record type: the following one.

< equal : 'a -> bool; 
  get_x : int;                                           (%)
  get_y : int;
  setcoord : int -> int -> unit >
as 'a

which is precisely the type that the following OCaml definition associates to the name ptEq in

type ptEq =                          
< equal : ptEq -> bool; 
  get_x : int;                                            (%%)
  get_y : int;
  setcoord : int -> int -> unit >;;

In OCaml in fact we can give a name to a type by means of the keyword type.
For instance we could write

type tipodeglinteri = int;;

and then write

let id (n:tipodeglinteri) = n;;

obtaining

val id : tipodeglinteri -> tipodeglinteri = <fun>

In OCaml we can use the keyword type also to define recursive types. The recursive type defined by the definition (%%) is precisely the one of the objects of class ptEq.

In OCaml, in the type (%) the construct as 'a is a binder. It binds the type variable in the record type to represents the record type itself.
This will help us to understand the motivation of type error messages like type variable is unbound.

Now, the subclass colPtEq can be defined in real OCaml as follows:

class colPtEq =
    object (self:'self)
      inherit ptEq as super
      val mutable color = "white"
      method get_color = color
      method setcolor new_color = color<- new_color
      method set new_x new_y new_color =
                              super#setcoord new_x new_y;
                              self#setcolor new_color
      method equal (p2:'self) = 
             super#equal p2 && self#get_color = p2#get_color
    end;;

The type of its object is the following record type

< equal : 'a -> bool; 
  get_color : string; 
  get_x : int; 
  get_y : int;
  set : int -> int -> string -> unit; 
  setcolor : string -> unit;
  setcoord : int -> int -> unit >
as 'a

The name colptEq corresponds to the recursive record type that could be defined by the following definition.

type colPtEq = < equal : colPtEq -> bool; 
                      get_color : string; 
                      get_x : int; 
                      get_y : int; 
                      set : int -> int -> string -> unit;
                      setcolor : string -> unit; 
                      setcoord : int -> int -> unit >

The use of the type of the self, 'self, then enables us to have invariant method overriding (equal is of type 'a -> bool for both the superclass and the subclass) and at the same time a type-safe definition of binary methods.
But we lose (3), that is the possibility of having subclasses defining subtypes.
The example below shows in fact that colPtEq is not a subtype of ptEq, since 'self is used in contravariant position in the type of the overridden method.

Let us define a function that, taken an element of ptEq, sets its coordinates to (1,1).

let f (p:ptEq) = p#setcoord 1 1;;

Now, let us create an element of colPtEq

let cpe = new colPtEq;;

If we now evaluate

f (cpe:>ptEq);;

the interpreter returns us the following error messsage

Characters 2-13:
  f (cpe:>ptEq);;
    ^^^^^^^^^^^
Error: Type
         colPtEq =
           < equal : colPtEq -> bool; get_color : string; get_x : int;
             get_y : int; set : int -> int -> string -> unit;
             setcolor : string -> unit; setcoord : int -> int -> unit >
       is not a subtype of type
         ptEq =
           < equal : ptEq -> bool; get_x : int; get_y : int;
             setcoord : int -> int -> unit > 
Type
  ptEq =
    < equal : ptEq -> bool; get_x : int; get_y : int;
      setcoord : int -> int -> unit >
is not a subtype of type
  colPtEq =
    < equal : colPtEq -> bool; get_color : string; get_x : int; get_y : 
      int; set : int -> int -> string -> unit; setcolor : string -> unit;
      setcoord : int -> int -> unit >

So, colPtEq <: ptEq does not hold (and obviously the vice versa as well). In fact, if we had colPtEq <: ptEq, then the input type of equal would change covariantly in the subclass (while, in order to have subtyping, it should change contravariatly). The vice versa is impossible for the same motivation and for the fact that ptEq record type has less fields than colPtEq.

There is then a trade-off between being able to deal correctly with binary methods and the possibility of having subclasses coinciding with subtypes.

We have seen that invariant method overriding creates problems in case of binary methods when there is the necessity to change the type of the input of the overridden method.

Invariant method overriding create obviously also problems in case we need to change the return type of the overridden method.

Let us have a look at an example.

Cloning objects

clone is usually (as in Java) a method of the class Object.
What is the return type of clone? ObjectType.
In type systems with invariant method averriding the return type of clone then remains ObjectType even when applied to an object in a subclass.

Let us go back to our cell example and redefine it in a OCaml-like sintax assuming to have a clone method as in Java, which we wish to use to define a method doubling the self.

class cloneCell =
    object (self)
    val mutable x = 0
    method write new_x = x<-new_x
    method read = x
    method double = self#clone
    end

We immediately have a problem, since clone is a method of the Object class and hence returns an element of class Object.
Solution? DownCast.

class cloneCell =
    object 
    val mutable x = 0
    method write new_x = x<-new_x
    method read = x
    method double = (self#clone):>cloneCell
    end

Let us now define a cloning color cell such that when it gets duplicated the integer value of the duplicate is set to 0 and the color to "white".

class cloneCellcol =
    object
    inherit cloneCell as super
    val mutable color = "red"
    method readcol = color
    method writecol new_col = color<-new_col
    method double = let cc = super#double
                     in cc#write 0; (cc:>cloneCellcol)#writecol "white";cc 
    end

The downcast present in the code of the overridden double now is not possible if return types must change invariantly.

Notice that also in real OCaml it is possible to write expressions like
cc#write 0; (cc:>cloneCellcol)#setcol "white";cc
that is a sequence of expressions of type unit that can end with any expression, whose value is the returned value of the whole expression (here in fact the value returned is that of cc)

Then, we must write the code of double without the downcast. This, however, does not keep us out of troubles. In fact, if we define

let ccc = new cloneCellcol

we get a type error when we evaluate

(ccc#double)#getcol

since (ccc#double) has type cloneCell.
To overcome the problem we should then write

((ccc#double):>cloneCellcol)#getcol

In real OCaml we can avoid type-unsafe downcasts, since the problem is solved, as for binary mehod, by means of the presence of the type of self.

The clone method in OCaml is actually a function called Oo.copy whose type is

(< .. > as 'a) -> 'a

meaning that Oo.copy takes any (possibly recursive) record type as input and returns an element of the same record type.
So, in real OCaml, we can define the cloneCell class as follows

class cloneCell =
    object (self:'self)
    val mutable x = 0
    method write new_x = x<-new_x
    method read = x
    method double = Oo.copy self
    end;;

whose elements have type

< double : 'a; read : int; write : int -> unit > as 'a

cloneCell is the name of the above type, which we could also define as follows:

type cloneCell = < double : cloneCell; read : int; write : int -> unit >

In real OCaml the subclass of colored cells is

class cloneCellcol =
    object (self:'self)
    inherit cloneCell as super
    val mutable color = "red"
    method readcol = color
    method writecol new_col = color<-new_col
    method double = let cc = super#double 
                         in cc#write 0; cc#writecol "white";cc
    end;;

By evaluating this definition we get

class cloneCellcol :
  object ('a)
    val mutable color : string
    val mutable x : int
    method double : 'a
    method read1col : string
    method read : int
    method writecol : string -> unit
    method write : int -> unit
  end

The type of elements of this subclass is

< double : 'a; 
  read : int; 
  readcol : string; 
  write : int -> unit;
  writecol : string -> unit >
  as 'a

which is a type equivalent to that defined by

type cloneCellcol = < double : cloneCellcol; 
                      read : int; 
                      readcol : string; 
                      write : int -> unit;
                      writecol : string -> unit >

No problem now to define

let ccc = new cloneCellcol;;

and to evaluate

(ccc#double)#readcol;;
- : string = "white"

Besides, in the present case, the subclass defines a subtype!!
It is in fact possible to define the following function

let g (c: cloneCell) = c#read;;

obtaining

val g : cloneCell -> int = < fun >

We can now evaluate without problems both

g cc1;;

obtaining

- : int = 0

and

g (ccc1:>cloneCell);;

obtaining

- : int = 0

So, in case we override methods whose return type is the one of the self, a subclass defines a subtype, since the change is in covariant w.r.t. output.

Also Java had an invariant type system (version 1.4 and previous ones), but with respect to OCaml it could deal with binary methods and the necessity of changing result type in overridden methods only using type-unsafe casts (or using F-bounded polymorphism).
Notice that from version 1.5 on, Java enables covariant change of return types in overridded methods.

Java�s interfaces are quite similar to record types.

Java and C++ both originally supported the invariant type discipline for subclasses. C++ later loosened its restrictions to allow covariant changes to return types. So did Java after the 1.5 version.

The language Sather is a (more efficient variant) of the language Eiffel. One of the goals of the Sather design was to fix the type insecurities of Eiffel. It accomplished this by only allowing contravariant changes in parameter types in subtypes, and allowing no changes in instance variables. In Sather, inheritance from abstract classes is used to define the subtype hierarchy, while inheritance from concrete classes does not result in subtypes. Thus subtyping and inheritance are somewhat independent.

Overloading

We already met this notion in Haskell dealing with Haskell type classes.

Overloading is a relevant mechanism in O.O. enabling to give the same name to different methods.

For an example of overloading, let us go back to our point and colored point example.
The method set in the colored point class had two spatial coordinates as arguments and assigned the color "white" to the color field. It would have been nice to have the possibility of having also a third color argument, but that was not possible for an overridden method. Some languages, however allow us to have a method set in the colored point class with three arguments, as shown below in an OCaml-like sintax (OCaml does not allow that).

class point =                                    
    object 
      val mutable x = 0
      val mutable y = 0
      method get_x = x
      method get_y = y
      method set new_x new_y = x <- new_x ; y<- new_y
    end;;

class colorPoint =                                    
    object 
      inherit point as super
      val mutable color = "white"
      method get_col = color
      method set new_x new_y new_col = super#set new_x new_y; color<-new_col
    end;;

Now, however, the new set method with three argument is not an overridden method, but actually is an overloaded method, that is a completely new function with the same name. Differently from the overridden methods, the choice of which function a method call is referring to is usually performed not at run time, but statically.

Let us see another example of overloading always in an OCaml-like sintax where contains is a boolean valued method telling whether the rectangle on which the method is invoked on contains the point argument. The method name contains is overloaded. The first version takes a point as argument, the second one instead takes two coordinates.

class rectangle = 
    object
    ...
    method contains(pt:point) = ...
    ...
    method contains(x:int,y:int) = ...
    ...
    end

The overloaded methods names are treated by Java and C++ as different names.

Languages have different rules concerning overloaded method overloading. In C++, overloaded methods must be defined in the same class, while in Java, the overloading can happen when a method in a superclass is inherited in a subclass that has a method with the same name, but different signature.(K.Bruce)

In most O.O. languages with overloading the language processor statically determines what method body is to be executed by simply looking at the number and type of the arguments used

If r is of type rectangle, apt of type point and ax and ay of type int, it is easy to statically determine which method we are referring to in the following invocations

    r#contains apt 

    r#contains ax ay

In the following table the differences between overloaded and overridden methods are shown

                overloaded methods      overridden methods
              ----------------------------------------------- 
message sends | resolved statically  |  resolved at run time
              |                      |
which class?  | (tipically)  same    |    subclass                  
              |                      |
signature     |        different     |   same (or subtype)

(the signature is the type of methods)

Unfortunately the interaction between overloaded method names (static resolution) and overridden methods names (dynamic resolution) can result in Great Confusion
Let see an example (by K.Bruce) in a OCaml-like sintax

class C =
   object 
        ...
   method equals(other:C): Boolean = ...  // equals1
        ...
   end
                                             

class SC = 
    object 
    inherits C as super 
          ...
    method equals(other:C): Boolean = ...  // equals1  overridden
 
    method equals(other:SC): Boolean = ... // equals2  overloaded
           ...
    end

Clearly
SC <: C
since the overloaded method is treated as a different method.

Let us now consider the following definition of an object o and the following method invocations

let o =
   object
   val mutable c = new C
   val mutable c' = new C
   val mutable sc = new SC
   method m = c'<-(sc:>C); EQ-CODE
   end

o#m;;

where EQ-CODE can be one of the following nine invocations:

c#equals c
c#equals c'
c#equals sc

c'#equals c
c'#equals c'
c'#equals sc

sc#equals c
sc#equals c'
sc#equals sc

Which equals method is actually executed as a result of each of the sends?
Unfortunately the answers are not so intuitive.

All 3 message sends to c result in the execution of method equals1 from class C;
All 3 message sends to c' result in the execution of method equals1 in class SC;
The first two message sends to sc also result in the execution of method equals1 from class SC;
Only the last message send, sc#equals sc , results in the execution of method equals2 from class SC.

(K.Bruce): Most people get this wrong, even when they understand the rules for overloading given above. Usually the error is thinking that method equals2 is selected for some or all of the message sends to c�, and for two or more of the message sends to sc.
The key to understanding which method body is selected in these examples is to remember what is resolved statically and what is resolved dynamically.
The overloading of equals is resolved statically. That is, the selection of equals1 versus equals2 is resolved solely on the static types of the receiver and parameters. Because the type of both variables c and c� is C, when the equals message is sent to c or c�, the type system examines the methods in class C at compile time to determine if there is an appropriate method equals.
There is only one method equals in C, and it has a parameter of type C. That method is appropriate for each of the three actual parameters to equals. The actual parameters c and c� are clearly of the appropriate type.
The parameter sc is also fine because its type, SC, is a subtype of C. Thus the first 6 method calls are all to equals1.
The first two message sends to sc have parameters with static type C. This is an exact match with the signature of equals1 in SC, so they resolve to that method. The last message send has a parameter of type SC, so its best match is method equals2.
In summary, the first 8 message sends resolve statically to method equals1, while the last resolves to method equals2.
Now all we have to do is figure out which of the first 8 message sends execute the body of equals1 from class C, and which execute the body from class SC. Because we know that all of these resolve statically to method equals 1, we determine which version of equals1 is executed by determining the class that generated the receiver of the message. This is now easy because the receiver of the first 3 message sends is a value generated from class C. Therefore those 3 message sends result in the execution of the body of equals1 from class C. The receivers of the rest of the message sends are values generated from class SC. Hence all of those message sends result in the execution of method bodies from class SC.

The lesson Kim Bruce teaches us is the following:

Object-oriented languages should not support static overloading of method names.

A last word about OCaml. Someone could say OCaml to be a multi-paradigm language. Actually OCaml is a functional language with imperative and O.O. features. In order to be multi-paradigm it should provide the possibility of clearly distinguishing, if wished, the part of the program using the different paradigms.