Due: Thursday, January 31st at 11:59pm

Source:

Download: project1.zip


x86Lite Specification

To Submit: Go Here

Overview

In this project you will implement an interpreter for the small subset of X86 that is the target language for the compilers we build in later projects. This project will continue to help get you up to speed with OCaml programming -- we'll need a few more constructs not used on Project0. You will also implement a non-trivial assembly-language program by hand to familiarize yourself with the workings of the X86 architecture.
Note: You may work alone or in pairs for this project.

Getting Started

To get started, download the project source files and create an Eclipse project whose executable is main.native or main.byte. The files included in project1.zip are briefly described below. Those marked with * are the only ones you should need to modify while completing this assignment.
assert.ml         - the assertion framework
gradedtests.ml    - graded test cases that we provide
main.ml           - the main test harness

x86.mli           - the X86lite interface 
x86.ml            - the X86lite instruction set implementation

interpreter.mli   - the interpreter's interface

*interpreter.ml    - where your interpreter code (Part I) should go
*providedtests.ml  - where your submitted test cases (Part II) should go
*team.txt          - your group information

Part I: The Interpreter

X86lite assembly code is organized into labeled blocks of instructions, which might be written in concrete syntax as shown below.
fact:
        pushl %ebp
        movl %esp, %ebp
        movl 8(%esp), %eax
        cmpl $0, %eax
        jG fact_recurse
        movl $1, %eax
        popl %ebp
        ret
fact_recurse:
        subl $1, %eax
        pushl %eax
        call fact
        addl $4, %esp
        movl 8(%esp), %ebx
        imull %ebx, %eax
        popl %ebp
        ret
main:
        pushl $4
        call fact
        addl $4, %esp
        ret
This code has three blocks, labeled fact, fact_recurse, and main. The code at labels fact and fact_recurse implements a recursive version of the familiar factorial function. The code at main calls factorial with the immediate value 4.

In this project you will implement an interpreter for X86lite programs, but rather than using the concrete syntax shown above, you will interpet programs written in an abstract syntax. For example, the factorial program above is represented using X86lite abstract syntax (and meta-level functions) as shown below. The main structure is a list of insn_blocks, each of which is just a record that has a lbl and a list of insn values.

[
(mk_insn_block (mk_lbl_named "fact") [
      Push (ebp);
      Mov (ebp, esp);
      Mov (eax, (stack_offset 8l));
      Cmp (eax, Imm 0l);
      J (Sgt, (mk_lbl_named "fact_recurse"));
      Mov (eax, Imm 1l);
      Pop (ebp);
      Ret
    ]); 
(mk_insn_block (mk_lbl_named "fact_recurse") [
      Sub (eax, Imm 1l);
      Push (eax);
      Call (Lbl (mk_lbl_named "fact"));
      Add (esp, (Imm 4l));
      Mov (ebx, (stack_offset 8l));
      Imul (Eax, ebx);
      Pop (ebp);
      Ret
    ]); 
(mk_insn_block (mk_lbl_named "main") [
      Push (Imm 4l);
      Call (Lbl (mk_lbl_named "fact"));
      Add (esp, (Imm 4l));
      Ret
   ]);
]
As you can see, the correspondence between the abstract syntax and the concrete syntax is quite close. The file x86.ml and its corresponding interface x86.mli together provide the basic definitions for the creating and manipulating X86lite abstract syntax -- the main types you should be aware of are lbl, reg, opnd, cnd, insn, and insn_block. Each of these corresponds pretty directly to a concept from the X86lite assembly language.

The x86Lite Specification gives the full details about the meaning of each instruction.

Read (or at least skim) the x86Lite Specification now. You might want to correlate the various parts of the X86lite machine with the datatypes defined in x86.ml.
The X86lite specification is written from the point of view of actual X86 hardware. The interpreter you implement should be faithful to that specification, except for a few points describe below. Our ML-level interpreter's representation of the X86lite machine state is given by the this type:
type x86_state = {
    s_mem : int32 array;    (* 1024 32-bit words -- the heap *)
    s_reg : int32 array;    (* 8 32-bit words -- the register file *)
    mutable s_OF : bool;    (* overflow flag *)
    mutable s_SF : bool;    (* sign bit flag *)
    mutable s_ZF : bool;    (* zero flag *)
}
The memory and register files are simulated by OCaml-level (mutable) arrays of int32 values. The three condition flags are mutable boolean fields; all of the state is bundled together in a record (see IOC Chapter 8.1 for more about OCaml's record types). The main differences between the interpreter and real X86 hardware are:

Tasks

Complete the implementation of the interpreter.mli interface in the interpreter.ml file, some parts of which are given to you. We recommend that you do things in this order:
Hints:
  • We have provided several helper functions for 32- and 64-bit arithmetic and for getting a bit out of a 32-bit word. You might find them useful.
  • You'll probably want a function that sets the three condition flags after a result has been computed.
  • Groups of instructions share common behavior -- for example, all of the arithmetic instructions are quite similar. You probably want to factor out the commonality as much as you can (to keep your code clean).
  • You will probably want to develop small test cases to try out the functionality of your interpreter. See gradedtests.ml for some examples of how to set up tests that can look at the final state of the machine.

Tests

We will grade this part of the assignment based on a suite of tests. Some of them are available for you to look at in gradedtests.ml, the rest of them we reserve for our own cases. We will also stress-test your interpreter on a number of "big" programs (see Part II) that we have developed and that your classmates will develop as part of this project.

To help other teams debug their interpreters, you are encouraged to share "microbenchmark" test cases on the class discussion list. These should be short (2-3 instruction) programs that test various functional aspects of the interpreter. We will not use these tests in our grading. You can add such test cases to the test suite defined in providedtests.ml

Part II: X86lite Assembly Programming

For this part of the assignment, you will create (by hand) a non-trivial X86lite assembly program to test your interpreter's behavior and get some experience programming in X86lite. The factorial program supplied with the test code is an example of what we mean by "non-trivial" -- test cases of roughly this level of difficulty can earn full credit. In particular, your program should include:

Some good candidates for such programs include: simple sorting or searching algorithms (i.e. treat some chunk of memory as an array of values to be sorted), simple arithmetic algorithms such as gcd, recursive functions over integers or very simple data structures such as linked lists. If you are unsure whether the test case you'd like to implement is sufficient, contact one of the course staff.

Your test should be a function of type unit -> unit that works in the assertion framework (as defined in assert.ml). This test should be supplied in providedtests.ml as the "Student-Provided Big Test for Part II". We will hand grade this submission (and test it against our own interpreter). We will also use all "correct" submitted tests to validate all of the other projects in the course -- the trickier your test is, the harder it will be for other teams to pass it!

Unlike the "microbenchmark" tests from Part I, do not post your "big" test to the class mailing list.

Grading

Projects that do not compile will receive no credit!

Your team's grade for this project will be based on:

Last modified: Sat Jan 26 17:32:14 EST 2013