One of the hallmarks of a good C programmer (and really any programmer in general) is having a strong mental model of how memory is handled. In this homework, we'll go through a series of exercises to strengthen our understanding of how our code manipulates memory.
First, we'll step away from coding for a bit to focus on how each instruction we issue modifies memory. We'll be creating memory diagrams that show what memory looks like after we execute our programs. Outside of being a good academic exercise, memory diagrams are a great tool to understand complex code. If you're ever stuck trying to debug a memory leak, seg fault, or some otherwise ungodly piece of code, don't be afraid to bust out a piece of paper and trace what's going on.
Before we begin the problems, we'll step through the basics here so you can get a feel for how to create these diagrams on your own.
Recall that the memory of our program is divided into the stack, the portion of memory dedicated to local variables and function stack frames, and the heap, the portion of memory dedicated to dynamic allocation via malloc. Thus whenever we declare a local variable, we are actually allocating space on the stack (and associating that space with a local variable) and when we call malloc, we are actually allocating space on the heap (and receiving a pointer to that memory).
Take a look at the following code, which has both local variables (which go on the stack) and dynamically allocated memory (which is put on the heap).
int main() { int x; int xs [3]; return 0; }
After this block of code executes, we will have the following stack:
Stack ----- (main) x: | ? | xs: | ? | | ? | | ? | -----
Some things to note in this diagram:
Here is an example of code that manipulates the variables declared above. Before looking at the corresponding memory diagram, start with the previous diagram and trace through the code, updating the diagram for each instruction you "execute". This is generally how you should go about the memory diagram problems coming up.
int f(int y) { return y + 2; } for (x = 0; x < 2; x++) { xs[x] = f(x); }
Here is a diagram of the code as the first call to function f returns.
Stack ----- (f) y: | 0 | (main) x: | 0 | xs: | ? | | ? | | ? | f(x): | 2 | -----
As you can see, the return value of f(x) has been copied into main's stack but has not yet been copied into xs[x]. All of the values in xs are still uninitialized because we have not written to them yet. Also note that the return value from the call to f(x) is allocated as soon as main is called, not when f(x) is called.
Here is the final memory diagram after the above code executes.
Stack ----- (main) x: | 2 | xs: | 2 | | 3 | | 4 | -----
Note that the stack of f is no longer listed because the function has finished executing. Thus, its memory is no longer accessible. Accessing this memory could lead to a dangling pointer error.
As discussed in class, memory can also be allocated on the heap and manipulated indirectly through a pointer. The following example shows how to diagram the memory allocated on the heap.
int getValue(int *p){ return *p; } int main() { int *i = new int; *i = 5; int x = getValue(i); return 0; }
Here is the diagram of this code as getValue(...) is executing.
Stack Heap ----- ---- (getValue) Addr1: |5| p: |-->Addr1| (main) i: |-->Addr1| x: |?| getValue(i): |?|
As you can see, we represent pointers on the stack in the form -->Address. (Note: This notation is not necessarily standard.)
With this in mind, here are few problems to tackle. For each of the commented points in the code snippets below (e.g., /* 1 */), draw a memory diagram representing the current state of memory at that point in the program.
/****************/ /** Problem #1 **/ /****************/ int main() { int x; int y = 5; int *z; /* 1 */ return 0; }
/****************/ /** Problem #2 **/ /****************/ int g(int z) { /* 1 */ return z + 5; } int f(int y) { int a = 10; return a + g(y); } int main() { int x = 2; int result = 0; result = f(x); /* 2 */ return 0; }
In this next problem, there is only 1 point at which you'll need to draw a stack diagram. You can assume that the value of n at that point is 4. This point in the code can be reached by calling fibonacci(5) and then calling fibonacci(n-1) recursively from the else branch.
/****************/ /** Problem #3 **/ /****************/ int fibonacci(int n) { if(n == 0){ return 0; }else if(n == 1){ return 1; }else{ /* 1 (n == 4) */ return fibonacci(n-1) + fibonacci(n-2); } } int main(int argc, char **argv) { int i; int num = 3; int values[3] = {0,1,5}; for(i = 0; i < num; i++){ values[i] = fibonacci(values[i]); } return 0; }
/****************/ /** Problem #4 **/ /****************/ void swap(int *lhs, int *rhs) { int tmp = *lhs; *lhs = *rhs; *rhs = tmp; /* 1 */ } int main() { int *p1 = new int; int *p2 = new int; *p1 = 5; *p2 = 6; swap(p1,p2); return 0; }
For this part of the assignment, you will use gdb to figure out the password required by two simple programs. Each program will ask for a password that can be determined by stepping through the code using gdb. Note: You can easily find the password by looking at the program, but for practice, you should use gdb to find the solution rather than simply looking at the C++ source!
If you have previously cloned the CIS190 repository, you can simply run git pull to update your copy of the repository to include the hw2 files. The CIS190/hw2 directory should contain the necessary source files.
Download the source files to your eniac drive using the following commands:
wget http://www.cis.upenn.edu/~cis190/fall2015/hw/hw2files/password1.cpp . wget http://www.cis.upenn.edu/~cis190/fall2015/hw/hw2files/password2.cpp . wget http://www.cis.upenn.edu/~cis190/fall2015/hw/hw2files/password3.cpp .
Compile the executables using the following commands:
g++ -o password1 password1.cpp -g g++ -o password2 password2.cpp -g g+= -o password3 password3.cpp -g
Run both of these programs with gdb and identify the password. To get you started, here are a few commands that you should run to start off.
gdb ./password1 ... .. Reading symbols from /mnt/eclipse/acg/users/delozier/scratch/password1...done. (gdb) break main Breakpoint 1 at 0x400b24: file password1.cpp, line 9. (gdb) run
From there, you can use the commands from class (step, next, break, continue, where, list) to identify the password.
Once you have identified the password, please submit a text file named passwords.txt to Canvas that lists the passwords for both programs and the transcript from gdb that shows how you found the password. You can copy and paste the output of your gdb session into the text file. In the event that the password may change from one execution to the next, please describe how the password is generated.
As discussed in class, pointer variables store addresses and can be used to indirectly access data. For example, in the following program, a pointer stores the address of the stack variable x and can be used to modify the value of x.
#includeint main() { int x = 0; int *p = &x; *p = 5; printf("%d\n",x); return 0; }
When this program is executed, it will print 5 because the pointer p was used to modify the memory that stores the value of x.
Download the program stack_safety.cpp from either github (in the CIS190/hw2/ folder) or from http://www.cis.upenn.edu/~cis190/spring2015/hw/hw2files/stack_safety.cpp using wget.
Compile stack_safety.cpp and run the program. You may notice that it prints a strange result. Explain how this strange result occurred.
If you are submitting a paper copy of the diagrams for part 1 of the homework, you will need to submit the paper copy either in class on Monday or during office hours on Wednesday. Your diagrams can be drawn by hand, done in ASCII like the examples above, or done in a graphics program, but they must be legible and clearly labelled by question number. Make sure to leave room for the graders to write comments and points earned! Uninitialized memory (values and addresses) should be represented with a '?'.
Hint: If you are having trouble drawing the memory diagrams for each of these programs, you are welcome to step through the program's execution using gdb. Setting a breakpoint at the comment line may help you to figure out what the stack looks like at that point in the execution.