The goal of this project is to write, in OCaml, a compiler that
translates Joos (a subset of Java) to Java bytecode.
The compiler project is split into a number of phases which are run on
the input program in sequence to produce the final output. These
phases are grouped into a number of hand-in assignments. The
responsibilities of each phase, and the interfaces between them, are
precisely defined in the hand-in assignment descriptions.
The Joos languages
Three different subsets of Java are defined: Joos 0, Joos 1 and Joos
2. These are ordered such that Joos 0 is a subset of Joos 1, which is
a subset of Joos 2, which is a subset of Java 1.3. The languages are
defined by the Java features which they contain.
Joos 0 is a very simple language. A compiler for Joos 0 is given as an
example of a complete compiler, which can be used as inspiration for
the project.
Joos 1 defines the minimum set of language features your compiler must
support. A skeleton for a Joos 1 compiler is given. Your job is to
fill out the (many) missing parts to obtain a complete Joos 1
compiler.
Joos 2 defines a set of additional language features you can implement
support for in your compiler in order to earn extra credit. An
executable reference implementation of Joos 2 is available.
Commandline arguments
The compiler takes (along with the list of source files to compile)
the following commandline arguments:
- -classpath classpath (or -cp classpath): Specify classpath where
the compiler should search for library class files.
- -joos1: Tells the reference compiler to reject
programs that use language features not in Joos 1. You are not
required (though welcome) to support this switch in your own
compiler.
- -noverify: Do not abort the compilation if some internal
code consistency checks fail (in the Limits phase). Useful
when debugging the code generation and optimization phases.
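As an illustration of how these switches could be handled, here is a minimal sketch using the standard library's Arg module. The options record, the parse_args function, the default classpath "." and the usage string are assumptions made for this example; the actual Main module in the skeleton may be organized differently.

```ocaml
(* Hypothetical option record; not taken from the skeleton's Main module. *)
type options = {
  mutable classpath : string;
  mutable joos1 : bool;
  mutable noverify : bool;
  mutable sources : string list;
}

(* Parse an argv-style array. Element 0 is the program name and is skipped. *)
let parse_args (argv : string array) : options =
  (* The default classpath "." is an assumption of this sketch. *)
  let opts =
    { classpath = "."; joos1 = false; noverify = false; sources = [] } in
  let set_cp path = opts.classpath <- path in
  let spec = [
    ("-classpath", Arg.String set_cp,
     "<path> where to search for library class files");
    ("-cp", Arg.String set_cp,
     "<path> short form of -classpath");
    ("-joos1", Arg.Unit (fun () -> opts.joos1 <- true),
     " reject language features not in Joos 1");
    ("-noverify", Arg.Unit (fun () -> opts.noverify <- true),
     " do not abort if internal consistency checks fail");
  ] in
  (* Every argument that is not a switch is treated as a source file. *)
  let add_source file = opts.sources <- file :: opts.sources in
  Arg.parse_argv ~current:(ref 0) argv spec add_source
    "usage: joosc [options] <source files>";
  opts.sources <- List.rev opts.sources;
  opts
```

Passing the argument array explicitly (rather than reading Sys.argv) keeps the function easy to exercise from a test or the toplevel.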
Support modules
In addition to the skeleton phases, the compiler skeleton contains a
number of support modules which you are advised to familiarize yourself with:
- Main: The commandline interface to the compiler.
- Error: Error reporting functions and a datatype of
all possible compiler errors. Any error in the
input program should be reported through the functions in this
module.
- Classenvironment: Interface to the class
library. Contains functions for searching the classpath for class
files and reading these class files.
- Classfileparser: Used by Classenvironment to
parse class files.
- Utils: Contains various useful helper functions for
processing the AST.
Support tools
You will be using the following tools for the project:
- ocamlc / ocamlopt: The OCaml (bytecode/native code) compiler. Used to compile the Joos compiler.
- ocamlfind: Compiler utility to search for installed
OCaml modules.
- ocamllex: Scanner generator.
- ocamlyacc: Parser generator.
- jasmin: Java bytecode assembler. Used to translate the
.j files produced by the Joos compiler into Java class
files.
- JavaLib-2.2: Java Bytecode Library. Used by
Classfileparser to parse class files.
- make: Use the Makefile provided, since it
contains all the right options and dependencies for building the Joos
compiler.
- joosc: Script provided along with the skeleton for
running the Joos compiler.
- joos2c: The reference Joos 2 compiler.
The testing framework
On your group page, you can test a selection of
compiler phases on the supplied test suite of Java programs. This will
build a compiler using your implementation of the selected phases and
the reference implementation for the rest and run this compiler on all
the test programs. Depending on the outcome, each test case will be
marked as either passing or failing.
The test cases are divided into three categories: negative, positive
Joos 1 and positive Joos 2.
- Negative cases are marked as [OKAY] if the compiler produces
an error message and [FAIL] otherwise.
- Positive Joos 1 cases are marked as [OKAY] if the program
compiles, assembles and runs correctly and [FAIL] otherwise.
- Positive Joos 2 cases are marked as [JOOS2] if the program
compiles, assembles and runs correctly, [JOOS1] if the compiler
produces an error message and [FAIL] otherwise.
A pure Joos 1 compiler will produce [OKAY] on all negative cases and
positive Joos 1 cases and [JOOS1] on all positive Joos 2 cases.
A pure Joos 2 compiler will produce [OKAY] on all negative cases and
positive Joos 1 cases and [JOOS2] on all positive Joos 2 cases.
A Joos compiler is said to be correct with respect to the test suite
(a necessary but not sufficient condition for being a correct Joos
compiler) if it produces no [FAIL] results.
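The marking rules above can be summarized as a small decision function. This is only a sketch of the rules as stated here, not the testing framework's actual code; the type and function names are made up for the example, and the Other_failure outcome stands for anything that is neither a compiler error message nor a correct run (for instance an assembler failure or wrong output).

```ocaml
(* Hypothetical model of the test categories, outcomes and marks. *)
type category = Negative | Positive_joos1 | Positive_joos2

type outcome =
  | Compile_error   (* the compiler produced an error message *)
  | Runs_correctly  (* the program compiled, assembled and ran correctly *)
  | Other_failure   (* anything else *)

type mark = Okay | Fail | Joos1 | Joos2

let mark_of (c : category) (o : outcome) : mark =
  match c, o with
  | Negative, Compile_error -> Okay
  | Negative, _ -> Fail
  | Positive_joos1, Runs_correctly -> Okay
  | Positive_joos1, _ -> Fail
  | Positive_joos2, Runs_correctly -> Joos2
  | Positive_joos2, Compile_error -> Joos1
  | Positive_joos2, Other_failure -> Fail

let string_of_mark = function
  | Okay -> "[OKAY]"
  | Fail -> "[FAIL]"
  | Joos1 -> "[JOOS1]"
  | Joos2 -> "[JOOS2]"

(* Correct with respect to the test suite: no test case is marked [FAIL]. *)
let correct_wrt_suite (results : (category * outcome) list) : bool =
  List.for_all (fun (c, o) -> mark_of c o <> Fail) results
```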
To reduce the workload of the testing system, each testcase has been
marked with a set of relevant phases, and a testcase is included in
the testrun only if the selected phases include one of the relevant
phases. Additionally, each group is allowed to perform 6 'complete
tests' per day, which run the testing framework on all testcases and
not just the relevant ones. The complete tests should be used once in
a while, since they might reveal errors that are inadvertently hidden
in the non-relevant testcases.
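The selection rule can be pictured as a simple filter. The sketch below is an assumption about how such a filter might look, not the testing system's implementation; the testcase record and the function names are hypothetical, and phases are represented by plain strings.

```ocaml
(* Hypothetical representation of a testcase and its relevant phases. *)
type testcase = { name : string; relevant : string list }

(* A testcase is included if this is a complete test, or if at least one
   of its relevant phases is among the selected phases. *)
let included ~complete ~selected (tc : testcase) : bool =
  complete || List.exists (fun phase -> List.mem phase selected) tc.relevant

let testrun ~complete ~selected (suite : testcase list) : testcase list =
  List.filter (included ~complete ~selected) suite
```

For example, with only Limits selected, a testcase whose sole relevant phase is CodeGeneration is skipped in a normal testrun but still included in a complete test.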
The -joos1 switch
Giving the -joos1 commandline switch to the compiler (or
selecting the -joos1 option from the group page) will make
the reference implementation of the compiler phases behave as a Joos 1
compiler rather than a Joos 2 compiler.
For all phases except CodeGeneration, all this switch does is
to reject programs that use certain Joos 2 language features as
defined in the assignment descriptions for each phase. Since a Joos 1
implementation of a compiler phase may assume that these rejections have
been done properly in all previous phases, even a correct Joos 1 phase
may produce failures on Joos 2 programs if some of the previous phases
implement Joos 2 behavior. In other words:
- If you are running with the -joos1 switch, your compiler
has to produce no failures, no matter how you combine your phases
with the reference implementation.
- If you are running without the -joos1 switch, and your
compiler does not implement all Joos 2 language features, and you
have a reference implementation phase running before one of your own
phases, then failures among the positive Joos 2 tests are
acceptable.
A full Joos 2 compiler must of course produce correct Joos 2 results
when mixed with the reference implementation phases running without
the -joos1 switch.
Decorations, transformations, implementations and checks
For each phase, the skeleton source code contains a number of public
intertype fields that indicate the decorations in the AST for
which the corresponding phase is responsible. The assignment
description for each individual phase describes what should be put
into these fields.
Some phases are responsible for doing some transformations on
the AST, i.e. rewriting parts of the AST based on newly calculated
information.
For each phase, a number of checks must be performed to
validate the correctness of the AST. If any of these checks fails, the
compiler must produce an error message (using the error_foo or
check_joos1_foo functions available in each phase).
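To give a feel for the two kinds of functions, here is a self-contained sketch of a check written in that style. Everything in it is hypothetical: the Compile_error exception, the joos1_mode flag, the function names and the member representation are stand-ins invented for the example, whereas the real skeleton reports errors through its Error module and defines its own error_foo and check_joos1_foo functions.

```ocaml
(* Hypothetical stand-in for the skeleton's error reporting. *)
exception Compile_error of string

(* Hypothetical stand-in for the state set by the -joos1 switch. *)
let joos1_mode = ref false

(* An "error_foo"-style function: the condition is always an error. *)
let error_duplicate_member name =
  raise (Compile_error ("duplicate member: " ^ name))

(* A "check_joos1_foo"-style function: an error only in Joos 1 mode. *)
let check_joos1_feature name =
  if !joos1_mode then
    raise (Compile_error ("feature not in Joos 1, used by: " ^ name))

(* Each member is a pair (name, uses_joos2_only_feature). *)
let check_members (members : (string * bool) list) : unit =
  let seen = Hashtbl.create 16 in
  List.iter
    (fun (name, joos2_only) ->
      if Hashtbl.mem seen name then error_duplicate_member name;
      Hashtbl.add seen name ();
      if joos2_only then check_joos1_feature name)
    members
```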
Some phases must additionally implement some functions that later phases
can call to query the calculated information.
You may not alter the modules defining the ASTs or the interface
.mli files that specify the interfaces of the phases.
However, you are welcome to add any functions and variables to the
modules that you find useful in implementing the individual phases.