Due: 6 April 2015 at 11:59:59pm
Download: hw5.zip
Oat v.1 Language Specification
To Submit: Go Here
This is a group project. Teams of size 1 or 2 are acceptable. If you are working in a team of two, only one student needs to submit the project.
Getting Started
To get started, download the project source files and create an Eclipse project whose executable is main.native or main.byte. The files included in hw5.zip are briefly described below. Those marked with * are the only ones you should need to modify while completing this assignment.util/assert.ml(i) - the assertion framework util/platform.ml - platform-specific compilation support util/range.ml(i) - range datatype for error messages ll/ll.ml - the abstract syntax for LLVMlite ll/lllib.ml - name generation and pretty-printing for LLVMlite ll/lllexer.mll - lexer for LLVMlite syntax ll/llparser.mly - parser generator for LLVMlite syntax ll/llinterp.ml - reference interpreter for the LLVMlite semantics /x86/x86.ml - the X86lite language used as a target README - help about the main test harness Makefile - basic make support for invoking ocamlbuild oatprograms/*.ll - example .oat programs used in testing main.ml - main test harness driver.ml - utilities for invoking the compiler backend.ml - sample solution to HW3 ast.ml - oat abstract syntax astlib.ml - pretty printing ctxt.ml - compilation context lexer.mll - oat lexer parser.mly - oat parser tc.ml - typechecker *tctxt.ml - typechecking context and subtyping implementation *frontend.ml - oat frontend runtime.c - oat runtime library providedtests.ml - where your own test cases should go gradedtests.ml - graded test cases that we provide
Overview
Implement the interesting parts of an OO-language with single inheritance and dynamic dispatch. As discussed in class, the Oat type system distinguishes possibly-null references from definitely not-null references. Your compiler will accept source files of the form below (seeclass A <: Object { new (int y)() { int x = y; } void print() { print_string(string_cat(string_cat("CIS", string_of_int(x)), " ROCKS")); return; } } class B <: A { new ()(120) {} void print() { super.print(); print_string(" too!"); return; } } int program (int argc, string[] argv) { A? a = null; a = new A(341); A b = new B(); if? (A x = a) { x.print(); } print_string("; "); b.print(); return 0; }and will produce an executable (by default named a.out) that, when linked against runtime.c and then executed produces the resulting output:
./a.out CIS341 ROCKS; CIS120 ROCKS too!
The main challenge of this project is implementing the dynamic dispatch, class-tables, and dynamically-checked type cast operations. As a warm up, you will first complete some parts of the typechecker.
The Oat Language
The language accepted by your compiler is a simple Java-like language that supports all of the types previously supported, along with class types and possibly-null types. Oat has a null value, but the typechecker prevents null-pointer dereferences by forcing programmers to explicitly check whether a possibly-null pointer is actually null using the if? statement.At this point, the Oat language is large enough that it is simpler to give the specification using inference rules than to use English prose. The Oat language specification contains a definition of the the full language syntax and a collection of inference rules that define Oat type checking. (Though it omits some of the definitions for the sake of brevity; the code in tc.ml implements the specification completely.)
See the file ast.mli for the OCaml representation of the abstract syntax -- the type typ of types is defined there, along with representations of expressions, statements, blocks, function declarations, etc. You should familiarize yourself with the correspondence between the OCaml representation and the notation used in the specification. The Astlib module defines some helper functions for working with the abstract syntax.
New Features
Classes and Object:Oat now supports class declarations with single inheritance. Classes can contain fields and methods, and they are required to have one constructor. For examples of the syntax, see the .oat files in the /oatprogram project directory.
The fields of a class are define inside the constructor block. In the example code from above, the class A defines the field x in its constructor (initializing it to y).
Oat supports standard dynamic dispatch -- see the code in tests/dynamic.oat for an example.
New expression forms:
Oat supports several new forms of expressions to allow field projection and
method invocation from a "path" of function/method calls, array indices, and
object fields. For example, you can write o.m(3).foo[7].x. Oat also
supports this and super with semantics analogous to that in
Java.
Possibly-Null References:
Reference types in Oat are forced to be non-null. To allow for null pointers,
Oat has types of the form C?, which is a reference to either an
object of type C or to null. Null checks are written using
the if? statement, for example:
C? nc = ...; // nc is a possibly null pointer to a C object // nc.m(); // this method invocation fails to typecheck // since nc might be null if? (C c = nc) { c.m(); // this method invocation succeeds }The example program illustrates how the null check can be used in practice.
Dynamically-checked Casts:
Oat also supports dynamically-checked downcasts, which allows the programmer to
refine the type of an object in a safe way. Such downcasts use the
same notation as the null checks, for example:
Object o = new C(); if? (C c = o) { // check whether o has type C, if so name it c c.m(); // here c has type C }
Task I: Type checking
We have provided you with a nearly complete front-end (lexer/parser/type checker) for full Oat. For this part of the assignment, you will implement a small part of the type system in tctxt.ml as a way to familiarize yourself with the parts of the language specification related to objects.
The missing part of the typechecker is the join_typ operation on types, which is derivable from the subtyping rules in the Oat specification.
Hints
- Before you begin, make sure you understand how the class heirarchy is represented in ctxt by looking over tc_fctxt.
- The frontend depends on the invariants set up by the typechecker, so although this part of the assignment isn't much code, you'll need to spend some time understanding the spec and getting this right.
Task II: Frontend Compilation: Constructors and NewObj
The provided frontend.ml code can already compile the subset of Oat not including class declarations and method calls. In this part of the project, you will complete the implementation, focusing on the new operations.
At the IR level, objects and vtables are both represented by structs. Objects are heap-allocated using the oat_malloc function which, like the dynamic array allocation from HW4, requests an uninitialized piece of memory from the runtime.An object constructor initializes the fields of this struct and reserves the first field for the vtable pointer. Vtables are simply global structs of function pointers, where the first field is a pointer to the vtable of the super class. The code that generates vtables is in the cmp_hierarchy function in frontend.ml.
Constructors
A constructor is just a global function that takes a this pointer to a freshly allocated and uninitialized object in memory. The constructor initializes the object by first calling the super class constructor, then initializing fields of the object, and setting the vtable pointer.Complete the implementation of the functions cmp_ctr in frontend.ml.
Hints
- Use the cmp_args function to compile argument lists and extend the local context.
- Use the cmp_fields function to compile the class fields.
- Our implementation was only ~30 loc. You might might not be on the right track if your code is significantly longer.
New
The counterpart to the constructor is the new keyword, which invokes the constructor to create a new object. Next, complete the NewObj case of cmp_exp.
Path Expressions: Method Invocation
In Oat, a path expression can be used as a left hand side to update the value of an object's field, as an expression to access the field of an object, or to invoke a method on an object. A path consists of either the keyword 'super', a left-hand-side, or a call that returns an object reference followed by the identifier of a field or method. Here, we first complete the case of dynamic dispatch.Complete the dynamic dispatch case of the cmp_call function in frontend.ml. See the comments there for more help.
Hints
- Remember that the first field of an object struct contains the object's vtable pointer, and not the first field of the object.
- Remember that the first field of a vtable struct contains a pointer to the super class' vtable, and not the first method.
- You may find the lookup_method helper function useful.
- Our implementation was only ~30 loc. You might might not be on the right track if your code is significantly longer.
Super Methods
In order to call shadowed methods, Oat needs a mechanism to explicitly refer to methods of super class from inside a method. Complete the super case of cmp_call. At this point, many of the test cases for the full compiler should be passing, and your compiler should work with any valid Oat program that doesn't use the if? statement.Dynamic Cast
Implement the Cast statement. This is the dynamic counterpart to the subtype function in the typechecker. As such, the code you generate must dynamically search up the class heirarchy to test whether the given object's type is a subtype of the one mentioned statically in the Cast statement. This search is essentially a while loop that does pointer dereferences to load the type information from the object representations. Recall that the run-time representation of an object's class is its vtable (dispatch label) pointer.The hardest part about implementing cast is satisfying the llvm type system. Since we only care about comparing the addresses of the vtables and only need to access the first pointer, one approach is to treat vtables as just their addresses: in the loop that traverses the chain of vtable pointers, bitcast the vtable pointer to (Ptr (Ptr I8)) prior to loading from the address, then compare the resulting (Ptr I8) against the address of the subclass' vtable pointer, and against the operand (Ptr I8, Null) for the loop guard.
Note that Oat also folds in null-checks as part of the dynamic type tests. (That is because null is a type.) The comments in the Cast case of cmp_stmt suggest the case analysis you need to do.
Hints
- Remember that the first field of an object struct contains the object's vtable pointer, and not the first field of the object.
- Remember that the first field of a vtable struct contains a pointer to the super class' vtable, and not the first method.
- Recall that these vtable pointers are the dynamic type information about an object.
- There are several reasonable ways to implement the search, which amounts to a loop or a recursive tree traversal.
- You will probably
Testing and Debugging Strategies
The test harness provided by main.ml gives several ways to assess your code. See the README file for a full description of the flags.
Graded Test Cases
As part of this project, you must post an interesting test case for the compiler to the course Piazza site. This test case must take the form of a .oat file along with expected input arguments and outputs (as found in the hard tests of gradedtests.ml).
The test case you submit to Piazza will not count if it is too similar to previously-posted tests! Your test should be distinct from prior test cases. (Note that this policy encourages you to submit test cases early!)
- Class definitions.
- Constructors.
- Dynamic dispatch through inherited methods.
- Dynamic dispatch through overriden methods.
- Dynamically-checked downcasts.
We will validate these tests against our own implementation of the compiler (and clang). A second component of your grade will be determined by how your compiler fairs against the test cases submitted by the other groups in the class.
Grading
Projects that do not compile will receive no credit!Your team's grade for this project will be based on:
- 90 Points: the various automated tests that we provide.
- 5 Points for posting an interesting test case to Piazza. (Graded manually.)
- 5 Points divided among the test cases created by other groups. (Graded manually.)