The goal of this project is to write, in OCaml, a compiler that translates Joos (a subset of Java) to Java bytecode.

The compiler project is split into a number of phases which are run on the input program in sequence to produce the final output. These phases are grouped into a number of hand-in assignments. The responsibilities of each phase, and the interfaces between them, are precisely defined in the hand-in assignment descriptions.
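As a rough illustration of phases being run in sequence, the sketch below chains phases as functions. It is only a toy under stated assumptions: the record type `ast`, the phases `phase_a` and `phase_b`, and the function `run_phases` are hypothetical names, and in the real skeleton each phase has its own interface as defined in the hand-in assignment descriptions.

```ocaml
(* Toy representation: the real AST types are defined by the skeleton. *)
type ast = { source : string; decorations : string list }

(* A phase takes the output of the previous phase and produces its own. *)
type phase = ast -> ast

(* Hypothetical phases that only record that they ran. *)
let phase_a : phase = fun a -> { a with decorations = a.decorations @ ["a"] }
let phase_b : phase = fun a -> { a with decorations = a.decorations @ ["b"] }

(* Run the phases in order, feeding each result to the next phase. *)
let run_phases (phases : phase list) (input : ast) : ast =
  List.fold_left (fun acc p -> p acc) input phases
```

Keeping each phase a plain function from its input to its output is what makes it possible to swap one implementation of a phase for another without touching the rest.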

The Joos languages

Three different subsets of Java are defined: Joos 0, Joos 1 and Joos 2. These are ordered such that Joos 0 is a subset of Joos 1, which is a subset of Joos 2, which is a subset of Java 1.3. The languages are defined by the Java features which they contain.

Joos 0 is a very simple language. A compiler for Joos 0 is given as an example of a complete compiler, which can be used as inspiration for the project.

Joos 1 defines the minimum set of language features your compiler must support. A skeleton for a Joos 1 compiler is given. Your job is to fill out the (many) missing parts to obtain a complete Joos 1 compiler.

Joos 2 defines a set of additional language features you can implement support for in your compiler in order to earn extra credit. An executable reference implementation of Joos 2 is available.

Commandline arguments

The compiler takes (along with the list of source files to compile) the following commandline arguments:

Support modules

In addition to the skeleton phases, the compiler skeleton contains a number of support modules which you are advised to familiarize yourself with:

Support tools

You will be using the following tools for the project:

The testing framework

On your group page, you can test a selection of compiler phases on the supplied test suite of Java programs. Doing so builds a compiler from your implementation of the selected phases and the reference implementation of the rest, and runs this compiler on all the test programs. Depending on the outcome, each test case is marked as either passing or failing.

The test cases are divided into three categories: negative, positive Joos 1 and positive Joos 2.

A pure Joos 1 compiler will produce [OKAY] on all negative cases and positive Joos 1 cases, and [JOOS1] on all positive Joos 2 cases. A pure Joos 2 compiler will produce [OKAY] on all negative cases and positive Joos 1 cases, and [JOOS2] on all positive Joos 2 cases.

A Joos compiler is said to be correct with respect to the test suite (a necessary but not sufficient condition for being a correct Joos compiler) if it produces no [FAIL] results.
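The expected results described above can be written down as a small table in code. This is only a sketch for orientation: the type and function names (`category`, `result`, `compiler`, `expected`, `correct_wrt_suite`) are made up here and are not part of the testing framework.

```ocaml
(* Test-case categories and result tags, as named in the text above. *)
type category = Negative | Positive_joos1 | Positive_joos2
type result = Okay | Joos1 | Joos2 | Fail
type compiler = Pure_joos1 | Pure_joos2

(* Expected tag for a pure compiler of the given kind. *)
let expected (c : compiler) (cat : category) : result =
  match c, cat with
  | _, (Negative | Positive_joos1) -> Okay
  | Pure_joos1, Positive_joos2 -> Joos1
  | Pure_joos2, Positive_joos2 -> Joos2

(* Correct with respect to the test suite: no [FAIL] results. *)
let correct_wrt_suite (results : result list) : bool =
  not (List.mem Fail results)
```

Note that `correct_wrt_suite` only captures the necessary condition stated above; passing it does not by itself make a compiler correct.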

To reduce the workload of the testing system, each test case has been marked with a set of relevant phases, and a test case is included in the test run only if the selected phases include one of its relevant phases. Additionally, each group is allowed to perform 6 'complete tests' per day, which run the testing framework on all test cases and not just the relevant ones. The complete tests should be used once in a while, since they might reveal errors inadvertently hidden in the non-relevant test cases.

The -joos1 switch

Giving the -joos1 commandline switch to the compiler (or selecting the -joos1 option from the group page) will make the reference implementation of the compiler phases behave as a Joos 1 compiler rather than a Joos 2 compiler. For all phases except CodeGeneration, all this switch does is reject programs that use certain Joos 2 language features, as defined in the assignment descriptions for each phase. Since a Joos 1 implementation of a compiler phase may assume that these rejections have been done properly in all previous phases, even a correct Joos 1 phase may produce failures on Joos 2 programs if some of the previous phases implement Joos 2 behavior. A full Joos 2 compiler, by contrast, must of course produce correct Joos 2 results when mixed with the reference implementation phases running without the -joos1 switch.
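A minimal sketch of how such a switch could gate Joos 2 features is shown below, using the standard library `Arg` module. The names `joos1`, `files`, `parse_args`, `Compile_error` and `reject_if_joos1` are hypothetical; the skeleton's own argument handling and its check_joos1_foo functions may look quite different.

```ocaml
(* Hypothetical driver fragment; the real compiler's handling may differ. *)
exception Compile_error of string

let joos1 = ref false            (* set when -joos1 is given *)
let files : string list ref = ref []

let spec = [ ("-joos1", Arg.Set joos1, " behave as a Joos 1 compiler") ]

(* Parse an argument vector; element 0 is the program name and is skipped. *)
let parse_args (argv : string array) : unit =
  Arg.parse_argv ~current:(ref 0) argv spec
    (fun f -> files := !files @ [f])
    "usage: joos [-joos1] file..."

(* Reject a Joos 2-only feature when running as a Joos 1 compiler. *)
let reject_if_joos1 (feature : string) : unit =
  if !joos1 then
    raise (Compile_error (feature ^ " is not allowed in Joos 1"))
```

With this shape, a phase calls `reject_if_joos1` at each point where it recognizes a Joos 2 feature; without the switch the call does nothing.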

Decorations, transformations, implementations and checks

The skeleton source code contains for each phase a number of public intertype fields that indicate the decorations in the AST for which the corresponding phase is responsible. The assignment description for each phase specifies what should be put into these fields.

Some phases are also responsible for transforming the AST, i.e. rewriting parts of the AST based on newly calculated information.

For each phase, a number of checks must be performed to validate the correctness of the AST. If any of these checks fails, the compiler must produce an error message (using the error_foo or check_joos1_foo functions available in each phase).
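The general shape of such a check can be sketched as a traversal that calls an error function when a condition is violated. Everything in this example is illustrative: the toy `expr` type, the `Compile_error` exception, the `error_undeclared_variable` function and the check itself are assumptions standing in for the real AST and the error_foo functions of the skeleton.

```ocaml
(* Toy AST, purely illustrative; the real AST modules come with the skeleton. *)
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr

exception Compile_error of string

(* Hypothetical stand-in for one of the error_foo functions. *)
let error_undeclared_variable (name : string) =
  raise (Compile_error ("undeclared variable: " ^ name))

(* Check that every variable used in [e] appears in [declared]. *)
let rec check_expr (declared : string list) (e : expr) : unit =
  match e with
  | Int _ -> ()
  | Var x -> if not (List.mem x declared) then error_undeclared_variable x
  | Add (l, r) ->
    check_expr declared l;
    check_expr declared r
```

Here the check returns unit on success and raises on the first violation, so a later phase can assume the property holds for any AST it receives.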

Some phases must additionally implement some functions that later phases can call to query the calculated information.

You may not alter the modules defining the ASTs or the .mli files that specify the interfaces of the phases. However, you are welcome to add any functions and variables that you find useful to the modules implementing the individual phases.