Due: 13 April 2017 at 11:59:59pm
Download: hw5.zip
Oat v.2 Language Specification
To Submit: Go Here
This is a group project. Teams of size 1 or 2 are acceptable. If you are working in a team of two, only one student needs to submit the project.
Getting Started
To get started, download the project source files and make sure you can compile main.native or main.byte. The files included in hw5.zip are briefly described below. Those marked with * are the only ones you should need to modify while completing this assignment.util/assert.ml(i) - the assertion framework util/platform.ml - platform-specific compilation support util/range.ml(i) - range datatype for error messages ll/ll.ml - the abstract syntax for LLVMlite ll/lllib.ml - name generation and pretty-printing for LLVMlite ll/lllexer.mll - lexer for LLVMlite syntax ll/llparser.mly - parser generator for LLVMlite syntax ll/llinterp.ml - reference interpreter for the LLVMlite semantics /x86/x86.ml - the X86lite language used as a target README - help about the main test harness Makefile - basic make support for invoking ocamlbuild atprograms/*.oat - example .oat v.1 programs used in testing hw5programs/*.oat - example .oat v.2 programs used in testing main.ml - main test harness driver.ml - utilities for invoking the compiler backend.ml - sample solution to HW3 astlib.ml - pretty printing lexer.mll - oat lexer parser.mly - oat parser Most important for this project: ---------------------------------- ast.ml - oat abstract syntax *frontend.ml - oat frontend [including most of a solution to HW4] *typechecker.ml - oat typechecker tctxt.ml - typechecking context data structure ---------------------------------- runtime.c - oat runtime library providedtests.ml - where your own test cases should go gradedtests.ml - graded test cases that we provide
Overview
Implement a compiler typechecker for an extended version of Oat that has boolean, int, string, array, struct and function pointer types as well as top-level, mutually-recursive functions and global variables. Your compiler will now accept source files of the formstruct Color { int red; int green; int blue; (Color) -> Color f } Color rot(Color c1) { var c2 = new Color{ red = c1.green; green = c1.blue; blue = c1.red; f = c1.f }; return c2; } global white = Color { red = 10; green = 20; blue = 30 ; f = rot}; int program (int argc, string[] argv) { return white.f(white).red; }and will produce an executable (by default named a.out) that, when linked against runtime.c and then executed produces the resulting output:
./a.out 20
The Oat Language
This version of Oat supports all of the features from HW4 but, in addition, support structs and function pointers.
Oat supports multiple base-types of data: int, bool, and string, as well as arrays of such data. The Oat language is large enough that it is simpler to give the specification of its type system using inference rules than to use English prose. The Oat language specification contains a definition of the language syntax and a collection of inference rules that define Oat type checking.
See the file ast.ml for the OCaml representation of the abstract syntax -- the type typ of types is defined there, along with representations of expressions, statements, blocks, function declarations, etc. You should familiarize yourself with the correspondence between the OCaml representation and the notation used in the specification. The astlib module defines some helper functions for printing Oat programs and abstract syntax.
New Features
Structs:Oat structs are declared by using the struct keyword at the top level. For example the following program declares a new struct type named Color with three fields. (Note: there is no trailing semicolon on the last struct field.)
struct Color { int red; int green; int blue } int program (int argc, string[] argv) { var garr = new Color { green = 4; red = 3; blue = 5 }; garr.red = 17; return garr.red + garr.green; }
Struct values can be create by using the new keyword followed by the name of the struct type and a record of field initializers. The order of the fields in the initializers does not have to match the order of the fields as declared, but all of the fields must be present and have the correct type.
Struct values are represented internally as pointers to heap-allocated blocks of memory. This means that structs, like strings and arrays, are reference values for the purposes of compilation.
Struct fields are also mutable. As shown in the sample program above, you can update the value of a struct.
Function Pointers:
This version of Oat supports function pointers as first-class values. This means that the name of a top-level declared function can itself be used as a value. For example, the following program declares a top-level function inc of type (int) -> int and passes it as an argument to another function named call:
int call((int) -> int f, int arg) { return f(arg); } int inc(int x) { return x + 1; } int program(int argc, string[] argv) { return call(inc, 3); }
Function types are written (t1, .., tn) -> tret and, as illustrated above, function identifiers act as values of the corresponding type. Note that such function identifiers, unlike global variables do not denote storage space, and so cannot be used as the left-hand side of any assignment statement. These function pointers are not true closures, since they cannot capture variables from a local scope.
Built-in Functions:
The built-in functions, whose types are given below, can also be
passed as function-pointers:
- string_of_array : (int[]) -> string Assumes each int of the array is the representation of an ASCII character.
- array_of_string : (string) -> int[]
- print_string : (string) -> unit
- print_int : (int) -> unit
- print_bool : (bool) -> unit
Task I: Typechecking
The main task of this project is to implement a typechecker for the Oat language. Alas, because this version of oat still supports null, ths typechecker will not guarantee the absence of null pointer dereferences, but it will rule out many other bugs, such as using incompatible structures or mis-using function pointers.
The typing rules describing the intended behavior of the typechecker are given in the accompanying Oat v.2 Language Specification. Use that, and the notes in typechecker.ml to get started. We suggest that you tackle this part of the project in this order:
- Try to read over the typing rules and get a sense of how the notion of context used there matches up with the implementation in tctxt.ml.
- Complete the implementation of typecheck_ty and its mutually recursive companions, so remind yourself how the typechecking rules line up with the code that implements them.
- Think about the intended behavior of the typechecker for expressions, and work out a few of the easier cases. We have given you helper functions for typechecking the primitive operations.
- Next tackle the context-building functions, which create the correct typing context for later typechecking.
- Take an initial stab at typecheck_fdecl. We suggest that you introduce a helper function called typecheck_block, which will be used for function declarations (and else where)
- Working backwards through the file, work on typechecking statements, which will rely heavily on typechecking expressions. Make sure you understand how the expected return type information and the behavior type of the statement are propagated through this part of hte code.
Task II: Frontend Compilation
We have provided a mostly complete sample implementation of the frontend of the compiler (corresponding to our solution to HW4). Your task for this part of the project is to add support for structures and function pointers.
Structs: There are only a few places where the code must be modified, each of which is marked as a TASK in the comments. Struct compilation is (broadly) similar to how we handle arrayes. The main difference is that you need to use the structure information to look up the field index for the struct representation (rather than computing an integer directly). Follow the TASK breadcrumbs left in the frontend and the comments there.
Testing and Debugging Strategies
The test harness provided by main.ml gives several ways to assess your code. See the README file for a full description of the flags.
Graded Test Cases
As part of this project, you must post an interesting test case for the compiler to the course Piazza site. This test case must take the form of a .oat file along with expected input arguments and outputs (as found in the hard tests of gradedtests.ml).
The test case you submit to Piazza will not count if it is too similar to previously-posted tests! Your test should be distinct from prior test cases. (Note that this policy encourages you to submit test cases early!)
We will validate these tests against our own implementation of the compiler (and clang). A second component of your grade will be determined by how your compiler fairs against the test cases submitted by the other groups in the class.
Grading
Projects that do not compile will receive no credit!Your team's grade for this project will be based on:
- 90 Points: the various automated tests that we provide.
- 5 Points for posting an interesting test case to Piazza. (Graded manually.)
- 5 Points divided among the test cases created by other groups. (Graded manually.)