An Exploration of OCaml

An Exploration of Objective Caml

Mat Kelly and Angel Brown


OCaml - A Practical Multi-Paradigm Language

OCaml is a programming language that extends the Caml programming language, which in turn was derived from ML, a functional programming language. This extension of the Caml language integrates object-oriented concepts into a mostly functional language.

While there are many sizable benefits to the typical functional programming philosophy, there is also no denying the impact that object-oriented problem solving methods have had on modern software engineering and programming practices. We posit that the OCaml programming language, in offering most of the expected features from the functional paradigm together with its support of object-orientation and additional flexibilities, is a highly practical language which enables the programmer to solve problems in a variety of ways.

Moreover, because OCaml contains object-oriented constructs with easy-to-consume semantics, one can transition into the language and gain the power of functional programming without the overhead of learning a potentially unfamiliar paradigm all at once.

We demonstrate these points with a careful exposition of the details of the language.

Introduction

The Caml programming language (Categorical Abstract Machine Language) was originally developed in France between 1987 and 1992 at the National Institute for Research in Computer Science (INRIA). It was created from ML in an effort to keep up with the flurry of automated theorem provers being developed at the time. Reliability and efficiency were among the chief goals of the language.

Object orientation was later added by Xavier Leroy, Jerome Vouillon, Damien Doligez and Didier Remy in 1996. The language is still maintained by INRIA today as an open-source project.

Toplevel Interactive Interpreter

OCaml has an interactive interpreter called the "toplevel", a "read-eval-print loop" useful for learning, quick programming and debugging. The Pervasives module in the standard library (named Stdlib in recent releases) specifies functions and types which are globally available. Though we typically need to qualify calls to functions in modules, the Pervasives module is always "open", both in the toplevel and in compiled programs, so we need not qualify its use.

The OCaml Toplevel and compiler may be downloaded from http://caml.inria.fr/ocaml/index.en.html.

Syntax

The OCaml Alphabet

Like most modern programming languages, OCaml has an extensive alphabet of symbols which denote variable declaration and assignment, comparison operations, names for types and functions, control structure keywords and grouping and terminal symbols. An exhaustive list may be found within the OCaml documentation, but a sufficiently comprehensive list will be provided below.

Lexical Syntax

Our text defines a language's lexical syntax to be the rules governing the symbols, operators and punctuation allowed in the language. Listed below are some of the most important such rules:

Every top-level statement entered in the toplevel ends in two semi-colons:
```
let x = 26;;
```
The "let" reserved word binds a variable name to a value. In interactive mode, the OCaml toplevel echoes the binding once it has been evaluated. (The user input follows the pound sign.)
```
# let x = 26;;
val x : int = 26
```
Notice that we do not specify that x will be bound to an integer. The compiler infers the type from the expression before the statement is evaluated.

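We may also create a reference with ref and assign new content to it using the assignment operator ":=". (The "<-" operator is used for mutable record fields and data members instead.)

	# let x = ref 0;;
	val x : int ref = {contents = 0}
	# x := 26;;
	- : unit = ()

Then if we request the contents of x we get:

	# x;;
	- : int ref = {contents = 26}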

References always point to a value of another type and are modifiable. Because a reference is a record with a single mutable field, it can also be modified through its contents field.
```
# x.contents <- 20;;
- : unit = ()
# x;;
- : int ref = {contents = 20}
```

Whitespace

Whitespace is not significant in OCaml unless within the context of a string.

Concrete Syntax

Mutability

Although OCaml is primarily a functional language, it does not require a purely functional style. The mutable keyword allows the value stored in a record field or data member to be changed after creation instead of remaining permanently bound. Like references, mutable fields act as a sort of "safe pointer" to a value. Unlike pointers in other languages, however, they can only be set and read (i.e. not manipulated in C-style ways). They are supplementary to the language for ease of programming, so they should be used only when no immutable construct is a suitable substitute.

Object declaration

OCaml supports classes, objects, inheritance and other object-oriented constructs, though it's not a language restricted to this paradigm. An example of a class structure is as follows:

			class myClass myvar = 
			object(self)
			 val mutable myDataMember = 10
			 method getMyDataMember = myDataMember
			 method setDataMember toThisValue = myDataMember <- toThisValue
			end;;

Recognizing the significance of all of the elements in this declaration is important.

class myClass myvar =

In this line, class signifies the upcoming declaration of a class. myClass is the name of the class, customizable by the programmer. myvar is a parameter that is passed to the class and is required when creating a new object of the class using the "new" construct. Further parameters can be specified with space delimitation, for example:

class myClass myvar myvar2 =

...would declare a class that requires two parameters to be passed for instantiation to an object.

object (self)

This line begins the body of the object that is created upon instantiation of the class. Specifying self in this regard binds a name to the object itself, which allows for later reference to this object's methods.

method getMyDataMember = myDataMember

Specifying a method within a class requires various elements. The method portion of this line specifies that the upcoming code is that of a method (as opposed to the value of a data member). getMyDataMember is a programmer-specified name for the method. Parameters can be passed to a method to be used in execution. Take, for instance, the setDataMember method's signature:

method setDataMember toThisValue

toThisValue represents the name of a parameter being passed in.

Back to the original example, everything after the equals operator represents the implementation of the method. In the case of the getMyDataMember method, the value of myDataMember is returned. Unlike in Java, no reference to this (or self) is needed to access data members internally. It should be noted that data members cannot be accessed from outside the object in OCaml and must be exposed through methods such as getters.
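To round out the walkthrough, here is a small sketch of how such a class might be instantiated and used. It is a self-contained variant of the class above, with two assumptions that are ours rather than the original example's: the data member is declared mutable (the setter needs this) and it is initialised from the class parameter. The doubled method is a hypothetical addition that shows what the self binding is for.

```ocaml
(* A self-contained variant of the class discussed above.
   Assumptions: the data member is mutable and starts at the value
   of the class parameter; "doubled" is a hypothetical extra method. *)
class myClass (myvar : int) =
  object (self)
    val mutable myDataMember = myvar
    method getMyDataMember = myDataMember
    method setDataMember toThisValue = myDataMember <- toThisValue
    (* self lets one method call another method of the same object *)
    method doubled = 2 * self#getMyDataMember
  end

(* "new" creates an object; "#" invokes a method on it *)
let obj = new myClass 10
let before = obj#getMyDataMember
let () = obj#setDataMember 26
let after = obj#getMyDataMember
let twice = obj#doubled
```

Trying to write obj#myDataMember or obj.myDataMember from outside the class is rejected by the compiler, which is why the getter and setter are needed.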

Conditionals

A few different conditional constructs exist in OCaml. The familiar if statement is easily understood as either

if boolean-condition then expression

if boolean-condition then expression else other-expression

It should be noted that because the semicolon (;) acts as a sequencing operator in OCaml, multiple statements in the then and else clauses of a conditional must be explicitly grouped together. This can be accomplished with parentheses or equivalently a begin and corresponding end.

			let x = 4 in
			if x = 3 then
			 print_endline "x = 3"
			else begin
			 print_endline "x != 3";
			 print_endline "error!";
			end;;

...or equivalently:

			let x = 4 in
			if x = 3 then
			 print_endline "x = 3"
			else (
			 print_endline "x != 3";
			 print_endline "error!";
			);;

Looping

A very primitive form of looping is available in OCaml. These constructs were added to assist those familiar with other languages' style of looping to get a jump-start in creating programs. However, OCaml best practices encourage "unrolling" loops into recursive function calls.

Unlike in other languages, OCaml has no statement (e.g. break) for leaving a loop before its initially established condition is exhausted; an early exit has to be expressed by raising an exception. The for and while loops are written using the following constructs.

			for variable = start [to/downto] end do
			 expression
			done

and the while construct:

			while boolean-condition do
			 expression
			done
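The following sketch puts the two looping styles side by side and shows one common way to leave a loop early. The function names and the Found exception are hypothetical, chosen only for this illustration.

```ocaml
(* Sum 1..n with a for loop and a mutable accumulator. *)
let sum_loop n =
  let total = ref 0 in
  for i = 1 to n do
    total := !total + i
  done;
  !total

(* The same computation "unrolled" into a recursive helper. *)
let sum_rec n =
  let rec go acc i = if i > n then acc else go (acc + i) (i + 1) in
  go 0 1

(* There is no break keyword; raising an exception and catching it
   outside the loop has a similar effect. *)
exception Found of int

let first_multiple_of_7 lo hi =
  try
    for i = lo to hi do
      if i mod 7 = 0 then raise (Found i)
    done;
    None
  with Found i -> Some i
```

The recursive version needs no mutable state, which is the main reason it is usually preferred.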

Comments

Comments in OCaml are delimited by the (* and *) character sequences and can be multi-line. This syntax is very similar to that of original C (i.e. /* ... */ ). Unlike C99, C++ and Java among others, however, OCaml does not have native support (without the use of language extensions) for single-line comments. Unlike in some languages, though, comments can be nested in order to comment out regions of code:

			(* Fix this:
			
			(*this syntax don't work within a class construct *)
			let foo x = new myObj 
		
			*)

Reserved Words

OCaml has a set of reserved words, characters, character sequences and identifiers. Keywords cannot be used as ordinary identifiers, although operators, including keyword operators such as mod and land, can be rebound by wrapping them in parentheses. The following list covers the reserved words and their intended usage.

Reserved Keywords (and select examples)

and - used to define mutually recursive functions and objects

				let rec foo i =
					match i with
						-1 -> true
						| x -> bar (i-1)
					and bar i =
						match i with
							-1 -> false
							| x -> foo (x-1)
				;;

				class lorem =
				object (self)
				 val i = (new ipsum)
				end
				and ipsum =
				object (self)
				 val l = (new lorem)
				end;;

as
assert
asr - arithmetic (sign-preserving) bitwise right shift
begin - syntactic sugar for open parenthesis
class
constraint

do - beginning of scope definition for a loop

				for j=0 to 20 do
				 Printf.printf "%d\n" j;
				done;;

				let counter = ref 0 in
				while !counter <= 20 do
				 Printf.printf "%d\n" !counter;
				 counter := (!counter + 1)
				done;;

done - closing of scope definition for a loop

downto - decrement signifier for for loops

				for i = 9 downto 0 do
				 Printf.printf "%d\n" i; 
				done;;
				Printf.printf "%s\n" "We have liftoff!";;

else
end - syntactic sugar for close parenthesis
exception
external
false
for
fun - inline function definer, e.g. let sum = fun x y -> x + y;;
function
functor

if - the beginnings of a simple conditional construct

				let x = 987654321 in
				if (x mod 7)=0 then Printf.printf "%d %s\n" x "is a multiple of 7"
				else (Printf.printf "%d %s\n" x "is not a multiple of 7");;

in
include

inherit - means for a class to extend another class

				class parentClass =
				object(self)
				 method doSomethingElse = print_string "Something else!"
				end;;
				
				class childClass =
				object (self)
				 inherit parentClass
				 method doSomething = print_string "Something!"
				end;;

				let c = new childClass;;
				c#doSomething;;
				c#doSomethingElse;;

initializer - method within an object definition that serves as a constructor
land - bitwise and
lazy
let - the means to declare a variable
lor - bitwise or
lsl - bitwise left shift
lsr - logical (zero-filling) bitwise right shift
lxor - bitwise xor
match
method - declares a method within a class or object (invoked via object#method)

mod - the modulo operator

					let x = 10 mod 4;;
					val x : int = 2

module - module declaration, begin definition
mutable
new - object creation
object
of
open - import
or
private

rec - used to signify that a function is recursive

				let rec range a b =
				  if a > b then []
				  else a :: range (a+1) b
				  ;;

sig
struct
then
to
true
try
type
val
virtual
when

while - continuously evaluate following condition while condition is true, e.g.

					let i = ref 0 in
					while (!i) < 100 do
					 Printf.printf "%d\n" !i;
					 i := 1 + (!i)
					done;;

with

Reserved character sequences

!= - physical inequality (compares identity rather than structure)
# - object method call delimiter (e.g. object#method)
&
&&
'
(
)
*
+
,
-
-.
->
.
..
:
::
:=
:>
;
;;
<
<-
=
>
>]
>}
?
??
[
[<
[>
[|
]
_
`
{
{<
|
|]
}
~

Types

OCaml, like its ML-derived counterparts, is statically typed. Because types are checked at compile time, OCaml's runtime performance is not hindered by the overhead of type checking, giving the language a slight advantage in speed. In addition to static typing, the language also possesses type inference, as do most functional languages.

OCaml contains an extensive set of native data types including integers, booleans, strings and arrays among others. No implicit casting exists in OCaml. However, the Pervasives module provides conversion functions for most primitives, and explicit coercions exist for object types. These functions can be used to convert one data type to another.
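As a brief illustration of explicit conversion, the sketch below uses a handful of the standard conversion functions. The variable names are ours; the functions themselves are part of the standard library.

```ocaml
(* Nothing is converted implicitly: each change of type is spelled out. *)
let i = 7
let f = float_of_int i +. 0.5        (* int -> float, then float addition with +. *)
let back = int_of_float f            (* float -> int, truncating toward zero *)
let s = string_of_int i ^ " items"   (* int -> string, then concatenation with ^ *)
let parsed = int_of_string "42"      (* string -> int; raises Failure on bad input *)
let c = Char.chr 97                  (* int -> char *)
```

Writing i + 0.5 instead would be rejected at compile time, since + only accepts integers.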

Basic Types

OCaml supports all of the standard basic types. They are all declared using the 'let' keyword and no type declaration is required.

Float	Int	Boolean
# let x = 2.0;;	# let x = 2;;	# let x = true;;

Char	String	Array
# let x = 'a';;	# let x = "some string";;	# let x = [|1; 2; 3|];;

List	Tuple	Unit
# let x = [1; 2; 3];;	# let x = ("Fred", 42, 'C');;	# let x = ();;

Lists

Lists play a large role in data manipulation in Objective Caml in terms of organizing data into collections. All entities within a list must be of the same data type. For example,

		let myList = [3; "a string"];;

...will not compile in OCaml. An empty list can be specified with []. On its own, the empty list has the polymorphic type 'a list, since it is not yet bound to an element type. Otherwise, the type of a list is inferred from its elements.

	# [];;
	- : 'a list = []
	# [3];;
	- : int list = [3]
	# ["foo"];;
	- : string list = ["foo"]
	# class foo = object end;; (* define the class foo *)
	class foo : object  end
	# [(new foo)];;
	- : foo list = [<obj>]

Various functions can be used to manipulate lists. An exhaustive list can be found in the documentation for the list module in the official manual.
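A few of the most frequently used list operations are sketched below. The values are arbitrary examples of ours; the operators and functions come from the language and the standard List module.

```ocaml
let numbers = [3; 1; 2]

let extended = 0 :: numbers                             (* prepend one element *)
let joined = numbers @ [4; 5]                           (* concatenate two lists *)
let sorted = List.sort compare numbers                  (* returns a new sorted list *)
let doubled = List.map (fun x -> 2 * x) numbers         (* transform each element *)
let evens = List.filter (fun x -> x mod 2 = 0) numbers  (* keep matching elements *)
let total = List.fold_left (+) 0 numbers                (* combine all elements *)
let size = List.length numbers
```

None of these calls modify numbers; each one builds and returns a new value.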

Arrays

If we need a mutable collection of one type, we use an array. OCaml arrays also permit easy random access which pairs nicely with their mutability. Like lists, all elements in an array must be of the same type.

	          # let myArray = [|1; 2; 3|];;
		  val myArray : int array = [|1; 2; 3|]

Strings

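OCaml strings behave like character arrays. As such, they provide easy random access and, in the releases this article describes, are mutable - another language design choice which favors practicality over functional fundamentalism. Since OCaml 4.06, strings are immutable by default and the Bytes module provides the mutable counterpart.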

	          # let myString = "my string";;
		  val myString : string = "my string"

To change the value of a character, use a combination of dot and bracket notation, and the arrow for assignment.

	          # myString.[0] <- 't';;
		  - : unit = ()
		  # myString;;
		  - : string = "ty string"
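On current compilers, where strings are immutable, the same in-place edit can be sketched with the Bytes module. This is an adaptation on our part rather than the original example; the variable names are hypothetical.

```ocaml
let s = "my string"

(* Bytes.of_string makes a mutable copy of the string *)
let b = Bytes.of_string s
let () = Bytes.set b 0 't'

(* Convert back to an ordinary string once editing is done *)
let changed = Bytes.to_string b

(* Reading a character by index still works on plain strings *)
let first = s.[0]
```

The original string s is untouched, since the edit is applied to a copy.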

User-defined Types (Records)

OCaml also permits the use of user-defined types called Records. A Record is a compound type, defined by a collection of named fields with explicitly defined types.

	          # type triple = {val1: int; val2: float; val3: string};;

Once the type is defined, OCaml will recognize and infer the type using the given parameter names...

	          # let three = {val1=3; val2=3.0; val3="three"};;
		  val three : triple = {val1 = 3; val2 = 3.; val3 = "three"}

We can then use dot notation to access the properties of a variable of this type.

	          # three.val3;;
		  - : string = "three"

Mutable Reference Types

Mutable references are yet another example of OCaml's preference for utility and practicality over strict adherence to paradigm standards. Using a mutable reference, the programmer may allocate a slot in memory of a certain type and later change the contents of that location in memory, so long as the new value is of the same type.
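A minimal sketch of mutable references in use follows. The account record type and all variable names are hypothetical and serve only to show that the same mechanism applies to mutable record fields.

```ocaml
(* ref allocates a mutable cell; ! reads it and := replaces its contents *)
let counter = ref 0
let () = counter := !counter + 1
let () = incr counter               (* standard shorthand for adding 1 to an int ref *)
let current = !counter

(* A ref is a record with a single mutable field named contents *)
let () = counter.contents <- 20
let via_field = !counter

(* Record fields can be declared mutable in the same spirit *)
type account = { owner : string; mutable balance : float }
let acct = { owner = "Fred"; balance = 10.0 }
let () = acct.balance <- acct.balance +. 5.0
```

Assigning a value of a different type, such as counter := "two", is a compile-time error.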

Polymorphic Types

The OCaml compiler will infer a type for every variable and function definition, even if the definition could possibly identify more than one type. In this event, the compiler assigns a polymorphic type to the definition.

In this example, we define a function which takes in two parameters and outputs a tuple containing them. Since there are no restrictions on the datatypes that may be entered into a tuple, this function may take any datatype for each of the input parameters.

	     # let tupleMaker x y = (x, y);;
	     val tupleMaker : 'a -> 'b -> 'a * 'b = <fun>

The compiler infers the type of input parameters to be polymorphic types.

Semantics

Assignment Semantics

Values are assigned to variables differently depending on the context. In variables not bound to object definitions, variables can be assigned using the let semantics.

	let x = 3;;

Variables being used as data members can be set using the <- semantics, provided they are declared mutable.

	class x =
	object
	 val mutable m = 4
	 method setM newM = m <- newM
	end;;

The ability to create mutable variables is one example of how OCaml strays from the traditions of pure functional languages in an effort to make the language more flexible and useful.

Input/Output Semantics

Files can be read and written using OCaml's channel-based I/O. Though the means to interact with files is documented in the official manual, the usage is a bit cryptic if this is the only resource consulted.

The Pervasives module contains a function open_in, which takes a string (the file name) as a parameter and returns an "in_channel" for the file. This channel can be read from using the input_line function, also in the Pervasives module. An example program that echoes a file's contents is

(* Recursively read the remaining lines of a channel into a list *)
let rec input_lines file =
   match try [input_line file] with End_of_file -> [] with
      [] -> []
      | line -> line @ input_lines file

(* The file name is the first command-line argument, Sys.argv.(1) *)
let inFile =
   try Sys.argv.(1)
   with Invalid_argument _ ->
      prerr_endline "Please specify a filename for input as the first argument.";
      exit 1;;

let inputChannel = open_in inFile;;
let stringList = input_lines inputChannel;;
close_in inputChannel;;

(* Print each line in file order *)
List.iter print_endline stringList;;

Continuing on with the list in memory from the file read example, we can also see that writing to a file uses similar means.

File write example

let outputStream = open_out "newFile.ml" in
(* Write each line, followed by a newline, to the output channel *)
List.iter (fun line -> Printf.fprintf outputStream "%s\n" line) stringList;
close_out outputStream;;

Exception Handling

New types of exceptions can be declared in OCaml with optional parameters.

			exception <Name>;;
			exception <Name> of <type>;;

The name of an exception must begin with a capital letter. Exceptions can be raised with the following syntax:

raise (<Name> arguments);;

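For example, if an exception is defined as having two parameters

exception ConcatNotAllowed of string * int;;

It can be raised and then handled with a try ... with block. (The negative-number check below is only an illustrative condition for raising the exception.)

			let concatThese (s, i) =
			 try
			  if i < 0 then raise (ConcatNotAllowed (s, i));
			  s ^ string_of_int i
			 with ConcatNotAllowed (s, i) ->
			  (print_string s;
			  print_string " and ";
			  print_int i;
			  print_string " cannot be concatenated.";
			  print_newline ();
			  "");;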

Abstractions

Objects

One of the key features of OCaml that its predecessors don't possess is the ability to use object-oriented constructs.

		 class className =
		  object (internalReferenceToSelf)
		  val aDataMember = "Data Member Value"
		  val mutable anotherDataMember = "A Data Member Whose Value Can Be Changed"
		  method aMethodForClassName = ()
		  method aMethodWithAParameter (aParameter:int) = ()
		 end;;

className must be an identifier that begins with a lowercase letter.
internalReferenceToSelf is used to refer to the current instance of the object and can be any identifier that begins with a lowercase letter. Traditionally, this variable is named self.
val is a keyword, while the variable name must be an identifier that begins with a lowercase letter. The type of the data member is inferred at compile time.
aMethodForClassName's return data type will be checked at compile time. In this case, the type is unit.
aParameter's data type should be specified if it cannot be inferred from the context of the method's contents.

An interesting feature of OCaml is the concept of immediate objects, which are class-less objects. Though the concept seems counterintuitive from a traditional object-oriented standpoint, the existence of immediate objects allows objects to be defined within expressions, unlike their class-based counterparts.

			let foo =
			 object
			  val mutable dmFoo = 0
			  method methodFoo = print_endline "Foo!"
			  method conParam p = print_string p
			 end;;

Modules

Every compiled OCaml source file is considered to be an OCaml module. Modules allow scope restriction, visibility restriction, encapsulation and a number of other features to exist in the language. As an example, a new module can be created in the myModule.ml file:

		let helloWorld () = print_endline "Hello world!";;

...then in tester.ml:

		open MyModule;;

		MyModule.helloWorld ();;

Compiling tester.ml with ocamlc -o tester tester.ml results in the error "Unbound module MyModule" being produced and the compilation failing. This is due to the dependency of tester on the MyModule module, which hasn't been compiled yet.

To create the MyModule module, first compile myModule.ml with ocamlc -c myModule.ml, which produces myModule.cmi and myModule.cmo. Then, to compile tester.ml with the resolved dependency, type ocamlc -o tester myModule.cmo tester.ml. Explicitly including the object code myModule.cmo in tester.ml's compilation will assure accessibility when the module is imported.

Upon execution of tester with ./tester, "Hello world!" is printed, as expected. It should be noted that referring to a module from another file requires the module name to be capitalized, per line 1 of tester.ml. This is in contrast to class names, which are required to begin with a lowercase letter. Additionally, the MyModule immediately preceding the helloWorld function call is superfluous in this case, but would be necessary if two opened modules contained a function with the same name.

In addition to the implicit module created for each file, modules can be declared explicitly. This technique can be used to regulate the scope of a program further and thus provide more cohesive encapsulation of code.

Re-using myModule.ml from above, we can define a submodule whose signature restricts which functions are accessible and how they may be called.

		module SubmoduleTest :
		sig
		 val helloWorld : unit -> unit
		 val addOnePointOne : int -> float
		end =
		struct
		 let helloWorld () = print_endline "Hello world!"
		 let addOnePointOne (i:int) = (float_of_int i) +. 1.1
		end

Additionally, revise tester.ml to access the module without explicitly opening it but rather, including it in the compilation command as before.

		MyModule.SubmoduleTest.helloWorld ();;

Here, the submodule's signature is explicitly defined before being used.

Functors

Perhaps the most abstract OCaml abstraction is the functor. A functor is a parametrization of a module by another module. A simple, commonly cited example is the Set.Make functor, which parametrizes the Set module found in the standard OCaml library, by the input module. Here we give an example which creates a module representing a set of integers:

		  module IntSet = Set.Make(struct type t = int
		                              let compare = compare
		                          end)

We are required to provide an implementation of the compare function since the Set module relies heavily on the implementation of this function.
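To show what the resulting module offers, the sketch below repeats the functor application so that it stands alone and then exercises a few of the generated functions. The variable names are ours; IntSet.of_list assumes OCaml 4.02 or later.

```ocaml
(* Apply the Set.Make functor to a module describing integers *)
module IntSet = Set.Make (struct
  type t = int
  let compare = compare   (* the standard polymorphic comparison *)
end)

(* Duplicates are discarded when the set is built *)
let s = IntSet.of_list [3; 1; 2; 3]

(* add returns a new set; s itself is unchanged *)
let s' = IntSet.add 5 s

let has_two = IntSet.mem 2 s'
let elems = IntSet.elements s'   (* elements in increasing order *)
let size = IntSet.cardinal s'
```

The ordering of elems is determined entirely by the compare function supplied to the functor.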

Functional Language Concepts

The defining characteristic of functional languages is their treatment of functions. In a purely functional language, a function's output depends only on the input, rather than additionally depending on the state and environment as in imperative languages. Indeed, the elimination of side effects is one of the functional paradigm's key attractions. OCaml preserves many of the traditional concepts from the functional paradigm regarding function handling and definition, making it just as attractive to functional programmers as the more typical functional languages.

Higher-Order Functions

We often use the phrase "first-class" when speaking about functions in functional languages, meaning that we can pass functions as arguments, return functions as results and store them in data structures. Functions which take functions as parameters are called higher-order functions and are supported in OCaml as well as in most functional languages.

A typical, simple example of such a function is the map function in the List module. The map function takes as input both a list of a given type, and a function on that type, and applies the input function to every element in the list.

		     # let increment x = x + 1
		          in List.map increment [1; 2; 3];;
			  - : int list = [2; 3; 4]

Currying Functions

Since we are defining functions in OCaml using the mathematical definition, it makes sense to expect that they behave as mathematical functions. For example, if we make a constant substitution for a parameter in a function of two variables, we get a function of one variable.

For example, this function sums two input integers:

		     # let sumInts x y = x + y;;

Substituting a constant for one parameter...

		     # sumInts 2;;
		     - : int -> int = <fun>

The compiler recognizes this as a function which takes one integer as input.
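A short sketch of how such a partially applied function can be put to work is given below. The names addTwo, five and shifted are hypothetical.

```ocaml
let sumInts x y = x + y

(* Supplying only the first argument yields a function of one integer *)
let addTwo = sumInts 2
let five = addTwo 3

(* Partial application combines naturally with higher-order functions *)
let shifted = List.map (sumInts 10) [1; 2; 3]
```

Here sumInts 10 is passed directly to List.map without ever being given a name.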

Pattern Matching

A common expectation of modern functional languages is the ability to make decisions based on matching patterns in structure and value.

In the most basic example, we can use pattern matching sort of like a case or switch statement...

		     # let is_nothing x =
		         match x with
		         | [] -> true
			 | _ -> false;;

But we may also use pattern matching in more complex scenarios such as in guarded expressions

		     # let
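As one possible illustration of guarded patterns (not the authors' original example), the sketch below attaches a when clause to a pattern so that the case applies only if the condition holds. The functions classify and sum_pairs are hypothetical.

```ocaml
(* Cases are tried from top to bottom; a guard must be true for its case to match *)
let classify n =
  match n with
  | 0 -> "zero"
  | x when x < 0 -> "negative"
  | x when x mod 2 = 0 -> "even"
  | _ -> "odd"

(* Patterns can also take structured values apart, here a list of pairs *)
let rec sum_pairs = function
  | [] -> 0
  | (a, b) :: rest -> a + b + sum_pairs rest
```

Because the cases are ordered, the "even" guard is only ever reached for positive numbers.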

Object-Oriented Concepts

Encapsulation

One of the most fundamental concepts in object-oriented programming is encapsulation. That is, data should only be accessible in precisely the situations necessary. Object-oriented programmers use this principle as a basic tenet in their designs. OCaml encourages encapsulation using both modules and classes.

Inheritance

Another key principle in the object-oriented paradigm is inheritance. This is perhaps the main selling point for object-orientation because it scores high marks for code reuse, maintainability and organization. OCaml not only supports inheritance, but it also permits multiple inheritance, a feature not included in some of the most popular object oriented languages.

		      class virtual foo =
		      object (self)
		          method virtual doBaz : unit
		          method doBaz2 = print_endline "Doing parent Baz2"
		          method doBaz3 = print_endline "Doing parent Baz3"
		      end;;

		      class bar =
		      object (self)
		          inherit foo as myFoo
		          (* self# dispatches to the overriding method, myFoo# to the parent's *)
		          method doBaz = self#doBaz2; myFoo#doBaz2; self#doBaz3;
		              myFoo#doBaz3
		          method doBaz3 = print_endline "Doing child Baz3"
		      end;;

		      let b = new bar;;
		      b#doBaz;;

When compiled and run, the above will output the following:

		     Doing parent Baz2
		     Doing parent Baz2
		     Doing child Baz3
		     Doing parent Baz3
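The example above uses a single parent. Multiple inheritance can be sketched as follows, with hypothetical classes walker, swimmer and duck invented purely for illustration.

```ocaml
class walker = object
  method walk = "walking"
end

class swimmer = object
  method swim = "swimming"
end

(* duck inherits the methods of both parent classes *)
class duck = object (self)
  inherit walker
  inherit swimmer
  method describe = self#walk ^ " and " ^ self#swim
end

let d = new duck
let description = d#describe

(* An object can be explicitly coerced to either parent type *)
let as_walker = (d :> walker)
let walking = as_walker#walk
```

If both parents defined a method with the same name, the definition inherited last would take precedence.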

Memory Management

Memory Use and Functional Languages

The basic types in OCaml (and in typical functional languages) are immutable. This means that every time a named item changes, another space in memory is allocated. It is not hard to imagine a program, perhaps one with heavy recursion involved, that could exhaust a machine's resources very quickly if there were no mechanism to manage all this memory allocation. The OCaml compiler optimizes for tail recursion, so in fact, in programs heavy on recursion, if we use tail recursion, OCaml actually manages resources more efficiently than object-oriented languages which don't have this feature. For all other memory allocation, though, we must rely on the OCaml Garbage Collector.
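The difference between ordinary recursion and tail recursion can be sketched with two versions of the same summation. The function names are hypothetical.

```ocaml
(* Not tail-recursive: the addition is performed after the recursive call
   returns, so every call keeps a stack frame alive *)
let rec sum_to n = if n = 0 then 0 else n + sum_to (n - 1)

(* Tail-recursive: the recursive call is the last thing the helper does,
   so the compiler can reuse the current stack frame *)
let sum_to_tail n =
  let rec go acc n = if n = 0 then acc else go (acc + n) (n - 1) in
  go 0 n
```

The accumulator argument acc carries the partial result forward, which is what frees the function from having to do any work after the recursive call.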

The OCaml Garbage Collector

OCaml uses a generational garbage collection scheme for memory management. That is, the frequency with which an object is examined is determined by the length of time it has been around. In OCaml, this is implemented by partitioning the heap into sections called the minor heap and the major heap. Objects are normally first put into the minor heap. If an object still has references to it when the minor heap is collected, the object is moved into the major heap. Garbage collection occurs on the major heap less frequently, lessening the number of times we have to check on those objects, and therefore, reducing garbage collection cost in general. This method of garbage collection is often touted as more efficient than reference counting. Because a reference counting garbage collector must also maintain information about the number of references each object has, an update to the 'garbage collection system' is required every time a reference is added or deleted, even if there is no net change in an object's reference count. The OCaml garbage collector, on the other hand, only requires resources when memory is allocated or when the programmer requests it.

The generational garbage collection design is based on the idea that in typical programs, there are objects which live for a very short period of time, and there are objects whose lifespan is generally much longer. So if we can approximate that an object which has lived for twenty garbage collection cycles is likely to stick around for twenty more, then it makes sense not to check that object again for another twenty cycles!

Garbage collection runs in process, so an application must wait for garbage collection to occur before continuing. Collection occurs automatically only when an allocation request is made and the minor heap is full. However, the garbage collector is accessible to the programmer via the Gc module, and a collection cycle may be called in code. Other highlights of this module include the ability to adjust the size of the minor heap versus the major heap, and to require the garbage collector to report every time it runs or to print statistics about the contents in memory.
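A minimal sketch of driving the collector through the Gc module follows. The doubling of the minor heap is an arbitrary choice for illustration, not a tuning recommendation, and the variable names are ours.

```ocaml
(* Request a complete collection cycle from code *)
let () = Gc.full_major ()

(* Gc.stat returns a record of counters describing the heap *)
let stats = Gc.stat ()
let () =
  Printf.printf "minor collections: %d, major collections: %d\n"
    stats.Gc.minor_collections stats.Gc.major_collections

(* Gc.get returns the current parameters; Gc.set installs modified ones.
   Here the minor heap size (measured in words) is doubled. *)
let params = Gc.get ()
let () =
  Gc.set { params with Gc.minor_heap_size = 2 * params.Gc.minor_heap_size }
```

The same record has a verbose field which, when set to a non-zero value, makes the runtime report collector activity on standard error.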

Weak Pointers

Another useful module which helps the programmer leverage OCaml's garbage collector is the Weak module. Using this module the programmer may create a 'weak' pointer to an object, indicating to the garbage collector that the object may be cleared if necessary, but that the object should remain in memory if possible. An interesting use for this is described in this OCaml tutorial. The tutorial suggests that one might want to create weak pointers to keep a cache of recently used data objects, for example. Every record in the cached collection is given a weak pointer, which turns into a normal pointer if the record is retrieved for use.

Disadvantages

The obvious disadvantage to OCaml's garbage collection method is that some objects may be kept around longer than necessary after they've been moved to the major heap. Also, the amount of immediately available space for allocation is smaller due to the fact that some portion of the heap is dedicated to the major heap. If our program creates a high number of short-lived objects, we may never use the major heap, but we would still be limited by its existence. Of course, if we know this up front, we can use the Gc module to allocate a larger space for the minor heap, but requiring direct manipulation of the garbage collection system is contrary to the purpose of the service. That is, the garbage collector is intended to allow the programmer to focus on programming instead of the allocation of minor versus major heap space.

Concluding Remarks

Derived from ML, OCaml has many of the features that make the functional language paradigm so attractive. In fact, commentary in some sources states that it is quite possible to simply ignore the "extras" and program in OCaml in a purely functional manner. Indeed, some sources advocate ignoring the mutable keyword, stating that it skirts the "functional-ness" of the language. One could argue, however, that this is precisely the point of the keyword. OCaml's features and language decisions often choose practicality over paradigm rigidity, trusting that a knowledgeable programmer should be given all the tools he or she might need to solve problems, rather than forcing a particular solution type by requiring strict adherence to a certain programming philosophy. By incorporating some key flexibilities into the functional foundation, and by creating solid support for the object-oriented constructs and principles to which most modern programmers have become accustomed, OCaml has gained a well-deserved reputation for versatility among its users. And though it is still a relatively obscure language when compared to the most popular object-oriented languages today, the recent interest in functional programming may be OCaml's ticket to fame.

Practice Assignment

Note: This assignment is meant to practice OCaml as documented on this website. If you don't feel prepared or are having trouble, feel free to first give the tutorial a run-through.

Step 1

Use the open keyword to include the tutorial modules. Create an instance of the tutorial class and call the step1 method for the instance.

With the module's .cmo and .cmi files in the same directory as your program, compile the application.

Step 2

Create a new class that inherits from the abstractTutorial class in the tutorial module. The abstractTutorial class takes a tutorial object and an int as parameters, so be sure to include these in the inherit statement.

Because the abstractTutorial class is virtual, its method step2 must be defined in your class. Using the compiler, determine the step2 method's parameters and in your step2's body, call the function step2a, whose functionality is defined in the abstractTutorial class.

Step 3a - Exceptions

Create a new method on your newly created class. Attempt to call the step2 method of your instance of the tutorial class with the integer 999 as a parameter.

Step 3b

Call step2 again with the same integer as a parameter.

Step 3c

Raise the exception Step3Exception with a string as the parameter.

Step 3d

In the handler, print the string resulting from a function call for step3c, as defined in the tutorial class.

Step 3e

From outside of your class definition, call your newly created method.

Step 4 - Polymorphic variant types

Create a function that takes one parameter with that parameter's type being a string.

Step 4b

Set up a call to your function using the value of the string resulting from a call to retrieveStep5Key for your tutorial object.

Step 4c

Within your function, use pattern matching to determine the value obtained from the call to retrieveStep5Key. Pass this value to your tutorial object's validateStep5 method. Take note that this function accepts a string as the first and only parameter.

Step 5

Input and output in OCaml are performed using channels. Create an output channel using 'open_out' as follows:

let myFile = open_out "myFile.txt";;

You may write to the output channel using 'output_string'

output_string myFile "This is my new output file.";;

Close the output channel to flush its buffered contents to the file.

close_out myFile;;

Open an input channel using 'open_in' as follows:

let readIn = open_in "myFile.txt";;
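To see the channel functions working together in miniature, here is a small sketch that writes one line and reads it back. The temporary file (created with Filename.temp_file) and the names path and first_line are assumptions made for this illustration; the assignment below still asks for a fuller set of functions.

```ocaml
(* A scratch file so the example does not overwrite anything. *)
let path = Filename.temp_file "channel_demo" ".txt"

(* Write a single line, then close the channel to flush it. *)
let () =
  let oc = open_out path in
  output_string oc "first line\n";
  close_out oc

(* Read the line back; input_line drops the trailing newline. *)
let first_line =
  let ic = open_in path in
  let line = input_line ic in
  close_in ic;
  line
```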

Now that we've given you a good start, research a little more about File I/O in OCaml and write a sequence of functions which write to a file and then read the contents of that file.

OCaml Environment Setup

OCaml is available for download from http://caml.inria.fr/download.en.html

There are installation instructions provided for Windows and Unix based operating systems. Once installed, type ocaml in a terminal window to start the interactive 'top level' OCaml environment. If top level starts up, you will see something like:

Objective Caml version 3.10.0
#

Primitive Types

The basic OCaml types are int, float, bool, char, string and unit (which is a return type similar to 'void' in C). When naming values, we do not have to declare any of these types explicitly thanks to OCaml's type inference engine. So to name a character for example, we use the let keyword as follows:

#let x = 'a';;

In the interactive Top Level environment, the compiler will respond to the above statement with:

val x : char = 'a'

This tells us that the name x is associated with type char and holds the value 'a'. The statement

#let x = 2;;

will bind x to an integer with the value 2. If we intend to define a float, we use the statement

#let x = 2.0;;

This is important to note especially because OCaml is statically typed, meaning we will not be able to simply convert x from an int to a float after it is defined. On a related note, the standard arithmetic operators are not overloaded in OCaml as they are in many languages. Addition of integers is expressed using the symbol '+' while the symbol '+.' is used to express addition of floats.

OCaml will not implicitly cast an integer to a float, so

#2.0 +. 2;;

yields the type error

This expression has type int but is here used with type float

(The language was developed and written in French. Some of the translations sound like Yoda speak.) We can, however, explicitly cast an int to a float and vice versa using the built-in functions float_of_int and int_of_float respectively.

Addition of floats:

#2.0 +. (float_of_int 2);;
- : float = 4.

Addition of integers:

#(int_of_float 2.0) + 2;;
- : int = 4

Other casting functions: char_of_int, int_of_char, string_of_int

The 'of' naming convention may seem odd at first, but it makes sense when we consider that these functions are mathematically defined functions. That is, when we say "a function f of a set X" we are speaking of a function whose domain is the set X.

Also note that no parentheses or brackets are required around parameters in function calls. If more than one parameter is passed into a function, a space is used as a delimiter.
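The operators and conversion functions above can be tried together in a short sketch. The value names (int_sum, mixed and so on) are made up for this illustration; the functions themselves are the built-in ones just listed.

```ocaml
(* Integer and float arithmetic use different operators. *)
let int_sum = 2 + 3
let float_sum = 2.0 +. 3.5

(* Mixing the two requires an explicit conversion. *)
let mixed = float_of_int int_sum +. 0.5

(* int_of_float truncates toward zero rather than rounding. *)
let truncated = int_of_float 3.9

(* Character and string conversions. *)
let code = int_of_char 'a'
let letter = char_of_int 98
let label = string_of_int 42
```

Note that each call passes its argument separated only by a space, with parentheses used just for grouping.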

Let It Be

The let keyword is also used to define functions. Since OCaml infers type, we don't specify the types of input parameters. The parameter types are inferred from the function definition.

#let square x =
x * x;;
val square : int -> int = <fun>

The type inference engine knew that the input and output are both of type int because of the * operator. When nothing in the function definition requires a specific type, any type of input is permitted. This function takes two input parameters of the same type and returns a list containing them:

#let toList x y = [x; y];;
val toList : 'a -> 'a -> 'a list = <fun>

No return statement is required. Instead, the last expression in the function definition is returned as output.
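A short sketch may help show how the inferred types play out in practice. square and toList are the functions from above; squareFloat and the value names are hypothetical additions for this example.

```ocaml
let square x = x * x            (* inferred as int -> int *)
let squareFloat x = x *. x      (* inferred as float -> float *)
let toList x y = [x; y]         (* inferred as 'a -> 'a -> 'a list *)

(* The same polymorphic function works at several types... *)
let ints = toList (square 3) 4
let floats = toList (squareFloat 1.5) 2.0
let words = toList "caml" "ocaml"

(* ...but both arguments must share one type, so
   toList 1 "one" would be rejected by the compiler. *)
```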

We can also define a local substitution of sorts using the combination let name = expression in. This statement essentially means 'substitute expression for every following occurrence of name until the symbol ;;'. For example:

#let average x y z =
  let sum = x +. y +. z in
  sum /. 3.0;;

In the last line, sum is replaced with x +. y +. z to compute the average.
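Local definitions can also be chained, each one visible to the expressions that follow it. The sketch below repeats average and adds a hypothetical hypotenuse function to show two local names in one body.

```ocaml
let average x y z =
  let sum = x +. y +. z in
  sum /. 3.0

(* Each local name is available only inside the function body. *)
let hypotenuse a b =
  let a2 = a *. a in
  let b2 = b *. b in
  sqrt (a2 +. b2)
```

Outside these definitions, the names sum, a2 and b2 do not exist.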

References

So far all of the values we've named have been immutable. If we need to be able to change a value after naming it, we need to name a reference.

# let x = ref 5;;

Now we have a named reference: a mutable cell whose current contents is the integer 5. If we "change the value" of x

# x := 20;;

we are actually replacing the contents of the cell, while x itself stays bound to the same reference. The cell now holds the integer 20, which we can read back with the dereference operator, as in !x.
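The following sketch walks through reading and updating a reference, using the standard operators ! (read) and := (write). The names before, after, alias and shared are invented for this example.

```ocaml
let x = ref 5
let before = !x            (* read the current contents *)

let () = x := 20           (* store a new value in the same cell *)
let after = !x

(* A second name for the same reference sees every update. *)
let alias = x
let () = alias := !alias + 1
let shared = !x
```

Because alias and x name the same cell, the update made through alias is visible through x.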

Recursion

Recursive functions are essential to the functional programming paradigm. To define a recursive function, you must use the rec keyword in the function definition as follows:

# let rec fib n = if n<=0 then 0 else begin
if n=1 then 1 else (fib (n-1)) + (fib (n-2))
end;;

This function returns the nth Fibonacci number. We haven't addressed if...then statements yet, but they are similar to what you have probably seen before. Notice that we don't need any notation to group simple statements, and that we use begin and end to group more than one statement in an if or else block.
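The doubly recursive definition recomputes the same values many times. As an illustrative alternative (not part of the original tutorial), here is a sketch with a local helper that carries two running values, so each step makes a single recursive call; fibIter and loop are hypothetical names.

```ocaml
(* The definition from above, laid out without begin/end. *)
let rec fib n =
  if n <= 0 then 0
  else if n = 1 then 1
  else fib (n - 1) + fib (n - 2)

(* a and b hold two consecutive Fibonacci numbers; each step
   shifts the pair forward by one position. *)
let fibIter n =
  let rec loop i a b =
    if i <= 0 then a else loop (i - 1) b (a + b)
  in
  loop n 0 1
```

Both functions agree on their results; the second simply does less repeated work for larger n.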

Modules

All compiled code is contained in a module. Let's exit the top level environment (type #quit;; or press ctrl+d in Unix; in Windows, ctrl+z followed by Enter) and create a module. Using your favorite text editor, create a file named helloWorld.ml. Then enter the following code:

print_string "Hello World"

In a command-line environment, navigate to the file you just created and type

(If you're using Windows...)

> ocamlc -o helloWorld.exe helloWorld.ml

(If you're using Unix...)

>ocamlc -o helloWorld helloWorld.ml

Then call the executable you just created.

(If you're using Windows...)

>helloWorld.exe

(If you're using Unix...)

>./helloWorld

You should receive the expected one line of output. Congratulations on creating your first OCaml module.

Lots of modules have been made available to you via your OCaml installation. To use another module in your code, you can either use the open keyword at the top of the file, or simply use dot notation to specify the module name and the function to call. For example, we can create simple graphical displays using the Graphics module. (In a Windows environment, you need to first create a custom top level by running this command in a command window: ocamlmktop -o ocaml-graphics graphics.cma).
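Before turning to Graphics, the two ways of reaching a module's contents can be compared using the standard List and String modules. This is only a small sketch; the names lengths and total are made up for the example.

```ocaml
(* Dot notation: qualify each name with its module. *)
let lengths = List.map String.length ["a"; "bc"; "def"]

(* The open keyword: bring a module's names into scope,
   so fold_left can be written without the List prefix. *)
open List
let total = fold_left (+) 0 lengths
```

Dot notation keeps the origin of each function visible, while open saves typing when a module is used heavily.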

Create a file named graphicsTest.ml and open it in a text editor. First, we want to include the graphics module, and show a graph:

open Graphics;;

open_graph " 640x480";;

Now we'll draw the simplest recognizable figure I could think of. The draw_circle function takes three arguments: the first two are the center in x,y coordinates (where the origin is the bottom left of the graph) and the last argument is the radius. The draw_arc function is similar, but takes a horizontal and a vertical radius, and a beginning and ending angle in degrees. (You can read the documentation for the graphics module at http://caml.inria.fr/pub/docs/manual-ocaml/libref/Graphics.html).

draw_circle 250 250 20;
draw_circle 350 250 20;
draw_circle 300 200 120;
draw_arc 300 200 60 60 180 360;

read_line ();;

To compile this, we need to link the graphics library.

On Windows:

ocamlc graphics.cma graphicsTest.ml -o graphicsTest.exe

Linux/Unix:

ocamlc graphics.cma graphicsTest.ml -o graphicsTest

Then just run the output file to see the graph!

Final Project: Little Languages

Throughout the semester we are building a little languages interpreter for online quizzes. The current prototype for our interpreter contains internal documentation on our progress in completing the project to specifications.

Creating a quiz

If you have not yet done so, please download the quiz program interpreter from this page.

To create a quiz, open your favorite text editor and give the quiz a brief introductory paragraph, then separate the introduction from the quiz body with a blank line. Following the blank line, specify the text of the question and follow it with either a checkbox answer (multiple selections permitted), a radio button answer (only one option) or a short answer. We'll give more information about the answer syntax in the next section. Following the answer block, specify the correct answer(s) with the keyword "Ans" followed by the number(s) of the answer(s). Then specify the point value for the question on the next line with an integer followed by the keyword "Points". Separate each question and answer block with a blank line. You may specify as many questions as you like.

Answer syntax

Checkbox Answer

If the answer to your question is a set, you might want to use the checkbox answer style. This allows you to present the quiz taker with several answer options, some subset of which is the correct answer.

To specify an unchecked checkbox option, on a new line type a right square bracket ("]") followed by a space and the answer text. You may wish to have an option default to checked. This is specified the same way, but with an x preceding the right square bracket. To specify the set of correct answers, use the "Ans" keyword followed by the number of each correct answer separated by spaces. Don't forget to include the number of points the answer is worth using an integer followed by the "Points" keyword. A complete checkbox answer might look like this:

] incorrect answer
x] correct answer 1
] correct answer 2
] another incorrect answer
Ans 2 3
2 Points

Radio Button Answer

To indicate to the quiz taker that only one answer option is correct, use the radio button answer style.

Answer options are specified with the letter "o" followed by a blank space and the answer text. You may wish to default one option to be checked. To do this, follow the "o" with a vertical bar "|". An example is as follows:

o| incorrect answer 1
o correct answer
o incorrect answer 2
o incorrect answer 3
Ans 2
2 Points

Short Answer

To require the quiz taker to input the entire text of an answer, you should use a short answer format. To specify this, type a right angle bracket followed by a space and then, enclosed in parentheses, some text to prompt the quiz taker. Then, on a new line, use the keyword "Ans" to specify the expected text answer. Please note that the quiz taker must input exactly the answer you specify. Your short answer specification might look like this:

> (Please input your answer here)
Ans answer text
3 Points
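To make the three answer styles concrete, here is a hypothetical sketch of how a program could tell the option lines apart by their leading marker. It is not taken from the project's interpreter (ll.ml); the type answer_line and the functions starts_with, drop and classify are invented for this illustration, and it assumes each option line begins directly with its marker followed by a space.

```ocaml
(* One constructor per kind of line; the bool records whether
   the option is checked or selected by default. *)
type answer_line =
  | Checkbox of bool * string
  | Radio of bool * string
  | ShortPrompt of string
  | Unknown of string

(* True when s begins with prefix. *)
let starts_with prefix s =
  let n = String.length prefix in
  String.length s >= n && String.sub s 0 n = prefix

(* Remove the first n characters of s. *)
let drop n s = String.sub s n (String.length s - n)

(* Longer markers are tested before the shorter ones they contain. *)
let classify line =
  if starts_with "x] " line then Checkbox (true, drop 3 line)
  else if starts_with "] " line then Checkbox (false, drop 2 line)
  else if starts_with "o| " line then Radio (true, drop 3 line)
  else if starts_with "o " line then Radio (false, drop 2 line)
  else if starts_with "> " line then ShortPrompt (drop 2 line)
  else Unknown line
```

For instance, classify "x] correct answer 1" yields Checkbox (true, "correct answer 1"), while an "Ans" or "Points" line falls through to Unknown and would need its own handling.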

Running a quiz program

The first time you use the quiz program, you will need to compile the interpreter before running your quiz program through it. Since the interpreter is written in OCaml, you will need to have an OCaml compiler installed on your machine. Please see the tutorial section of this website for more instruction on installing OCaml.

Once you have OCaml installed, open a command prompt and navigate to the folder containing the quiz program interpreter. To compile the interpreter, type the following:

ocamlc -o ll str.cma ll.ml

Provided no errors occur, you have successfully compiled the interpreter and you are now ready to run your quiz program. To run the program, type the following at the command prompt:

ocamlrun ll "myQuiz.txt"

...where "myQuiz.txt" is the name of your quiz program. You should expect the interpreter to display the text of each question and its answer options (or just a prompt for a short answer) and wait for the quiz taker to specify the input. The quiz taker will receive a score after answering the last question.

Little Languages Grammar Rules (EBNF)

Program → Paragraph CRLF Questions
Paragraph → {a-zA-Z0-9}
CRLF → \r\n
Answers → ShortAnswer | RadioAnswer | CheckboxAnswer
ShortAnswer → > CRLF CRLF
Questions → {Paragraph CRLF Answers CRLF CorrectAnswers Points Questions} | Paragraph CRLF Answers
CheckboxAnswer → [ ]{Paragraph} | x]{Paragraph}]⁺ CRLF CheckboxAnswer | ]{Paragraph} | x]{Paragraph}
RadioAnswer_Unselected → {Paragraph} CRLF RadioAnswer_Unselected | {Paragraph} CRLF RadioAnswer_Selected
RadioAnswer_Selected → {Paragraph} CRLF RadioAnswer_Selected | {Paragraph} CRLF
CorrectAnswers → Ans {a-zA-Z0-9}⁺ CRLF
Points → {0-9}⁺ Points CRLF CRLF

For clarity, the metasyntax of the above grammar rules (e.g. the | representing "or") is shown in blue text. This scheme is used because the symbols usually used to convey a grammar are significant to the language itself.

An example "program" for the Little Languages interpreter can be accessed below the prototype code.

Schedule & Homework

This section serves as a repository for deliverables and assignments for CSCI618 - Programming Languages.

Schedule
Homework

Projected Schedule

As projected on January 19, 2009, we plan on having the following deliverables published by their respective dates.

Syntax draft 1 due - January 28
Website frame published - January 28
Naming draft 1 due - February 4
Types draft 1 due - February 11
Syntax draft 2 due - February 11
Semantics draft 1 due - February 18
Naming draft 2 due - February 18
Syntax draft 3 due - February 18
Semantics draft 2 due - February 25
Naming draft 3 due - February 25
Tutorial draft 1 due - February 25
Types draft 2 due - February 25
PHASE 1 DUE | Examples: Simple I/O Scanner/Parser/LittleLanguages Prototype
Key Abstractions draft 1 due - March 4
First Iteration of project completed - March 11
Memory Management draft 1 due - March 11
Key abstractions draft 2 due - March 11
Memory Management draft 2 due - March 18
Key abstractions draft 3 due - March 18
Website updated with all submissions - March 18
PHASE 2 DUE
Second Iteration of project completed - April 1
Final iteration of project completed - April 22
FINAL PHASE DUE

Homework

Below are the homework assignments submitted for the class for which this site was built.

Homework 2

Homework 3

Homework 4 and 5

Homework 6

Homework 7 and 8

Homework 9

Homework 10

Homework 11

Homework - Scala

Homework - Ruby

Homework - Python

Homework - Lua