Goals
- Practice writing a self-contained C++ program that reads from stdin
- Get you thinking about and using concurrent processes effectively
- Exposure to how Unix-based operating systems work on the command line
Collaboration
You will complete every assignment in CIT 5950 on your own. You may discuss high-level ideas with other students, but any viewing, sharing, copying, or dictating of code is forbidden. If you are worried about whether something violates academic integrity, please post on Ed or contact the instructor.
Setup
For this assignment, you need to set up a Linux C++ development environment. We encourage you to use the same environment that you set up for the final project so that you can make use of Boost.
You can download the starter files into your Docker container by running the following command:
curl -o pipeshell.zip https://www.seas.upenn.edu/~cit5950/current/projects/code/pipeshell.zip
You can also download the files manually here if you would like: pipeshell.zip
Next, extract the files by running
unzip pipeshell.zip
You can then open the project in either Vim or VSCode.
For Vim, you just need to run
cd pipeshell
vim pipe_shell.cpp
For VSCode, follow steps similar to the ones we used to open check_setup in the setup document.
Overview
Most of the details of what you need to do to handle processes and create pipes are covered in lecture. Look at the two or three most recent lectures, which discuss the file descriptor table, pipes, fork, and exec.
In this assignment, you will be implementing a very simplified version of the UNIX shell (terminal) that you have been using to compile, run, and debug your code previously in the course.
The shell you write will need to read commands from standard input, handle the execution of any programs in that input, and facilitate piping input from the stdout of one program to the stdin of another program.
For example, if the user inputs the line ls | wc, your shell should fork off two programs (one for ls and one for wc) and establish a pipe from ls to wc.
Similarly, if someone were to run ls | head | pipe, then you should fork three processes and set up two pipes.
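The two-program case described above can be sketched as follows. This is a minimal illustration, not the reference solution: the helper name run_two_stage_pipe is hypothetical, and it assumes both programs take no arguments.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdlib>

// Hypothetical helper (not part of the starter code): runs
// "producer | consumer" for two argument-free programs and returns the
// consumer's exit status, or -1 if a system call fails.
int run_two_stage_pipe(const char* producer, const char* consumer) {
  int fds[2];  // fds[0] is the read end, fds[1] is the write end
  if (pipe(fds) == -1) return -1;

  pid_t first = fork();
  if (first == -1) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }
  if (first == 0) {
    // Child 1: stdout goes into the pipe.
    dup2(fds[1], STDOUT_FILENO);
    close(fds[0]);
    close(fds[1]);
    char* const argv[] = {const_cast<char*>(producer), nullptr};
    execvp(producer, argv);
    _exit(EXIT_FAILURE);  // only reached if execvp fails
  }

  pid_t second = fork();
  if (second == -1) {
    close(fds[0]);
    close(fds[1]);
    waitpid(first, nullptr, 0);
    return -1;
  }
  if (second == 0) {
    // Child 2: stdin comes from the pipe.
    dup2(fds[0], STDIN_FILENO);
    close(fds[0]);
    close(fds[1]);
    char* const argv[] = {const_cast<char*>(consumer), nullptr};
    execvp(consumer, argv);
    _exit(EXIT_FAILURE);  // only reached if execvp fails
  }

  // Parent: close both ends so the consumer can see EOF, then wait.
  close(fds[0]);
  close(fds[1]);
  int status = 0;
  waitpid(first, nullptr, 0);
  if (waitpid(second, &status, 0) == -1) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_two_stage_pipe("ls", "wc") forks both children before waiting on either, so the two programs run at the same time. The parent must close its own copies of the pipe ends, otherwise wc would never read EOF.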
The shell you write will not need to implement most of the complexities of a standard UNIX shell, such as environment variables or shell features like &, >, >>, <, &&, ;, and many other command line symbols.
We have provided a compiled sample solution in the solution_binaries folder that you can use as a comparison to what your program should do.
If you run the solution binary, run it like this: ./solution_binaries/pipe_shell
Instructions
We highly recommend you use the Boost function split(), which is covered in one of the recitations.
You will find the starter code in Codio, where you will also submit the assignment. Among these files, there are:
- pipe_shell.cpp: a mostly empty file where you will implement your UNIX shell.
- Makefile: used to compile your program.
- sh.cpp: a sample program that takes in a program with optional arguments and executes that command with the specified args by forking a process to run it. You may want to look at this for inspiration on how to use fork().
- stdin_echo.cpp: a simple program that reads from stdin and prints everything it reads to stdout until EOF is read in, and then terminates. This program is implemented for you and may be useful for debugging.
- example_tests/: a directory containing sample inputs and their corresponding outputs.
- solution_binaries/: a directory containing a compiled sample solution for the pipe_shell that you can use for testing your code.
For pipe_shell.cpp there are some specific requirements:
- Your program should read in commands from stdin (e.g. the cin stream) one line at a time. A command consists of a sequence of programs separated by the pipe character |.
- Continue reading and executing commands until you read EOF from stdin or exactly exit is input on one line.
- Wait for the current command (sequence of programs) to terminate before starting the next command.
- The child programs of a command must execute in parallel.
- Programs can be named by either an absolute path or just by the program name (execvp should handle this for you).
If you are not sure whether your shell should behave a certain way, run the solution binary and see what it does. Your program should mimic the solution binary’s behaviour. Please ask on Piazza if you have any questions about these or any other requirements/constraints for your program.
You can make any changes to pipe_shell.cpp to implement your shell. Note that you are also free to alter the provided programs stdin_echo and sh, which may be useful for debugging your shell.
We HIGHLY recommend that you read all of this guide and are familiar with all lectures that covered this material before you start writing any code for this homework.
Suggested Approach
Below we have provided a suggested approach to this homework. You are not required to follow this ordering if you believe another approach would work better for you. Try to check your progress after each step, although this can be a little difficult to do.
- Start by READING THE ENTIRE SPECIFICATION. It shouldn’t be too long and will help with your understanding of the assignment.
- Take a look at the provided programs sh.cpp and stdin_echo.cpp. Make sure you understand what is happening in these programs and try running them yourself.
- Run the solution binary for pipe_shell and make sure you understand what it is doing. The program you write should have the same behaviour. You can run the solution binary like this: ./solution_binaries/pipe_shell
- Start implementing pipe_shell.cpp by prompting the user for input by printing out $. Have your program loop continually: read a line from the user, print it back out, and then re-prompt the user. stdin_echo may be useful to look at while doing this.
- Modify your program so that if the user inputs the end-of-file character (Ctrl+D) or types exactly exit and hits enter, your program exits gracefully.
- Modify your program to fork and execute the user input as a command, similar to how sh.cpp does it. Instead of taking the command from the command line arguments through argv, take it from the input read from stdin, where each line is one command and its command line arguments.
- Modify your program to detect and handle the case where a command consists of two programs separated by a | character (e.g. ls | wc). If the | is detected, your code should fork both programs with a pipe running between the two.
- Generalize your code to handle any number of | characters and programs in a single command. This is the most complicated step.
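For the step that executes a single program read from stdin, one possible shape is sketched below. The helper names to_argv and run_single are hypothetical, sh.cpp may structure this differently, and the sketch assumes the line has already been split into tokens.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: builds the null-terminated array that execvp
// expects. The pointers refer into `args`, so `args` must outlive the
// returned vector.
std::vector<char*> to_argv(std::vector<std::string>& args) {
  std::vector<char*> argv;
  for (std::string& arg : args) {
    argv.push_back(&arg[0]);
  }
  argv.push_back(nullptr);
  return argv;
}

// Hypothetical helper: forks a child to run one program with its
// arguments, waits for it, and returns its exit status (-1 on failure).
int run_single(std::vector<std::string> args) {
  if (args.empty()) return -1;
  std::vector<char*> argv = to_argv(args);

  pid_t pid = fork();
  if (pid == -1) return -1;
  if (pid == 0) {
    execvp(argv[0], argv.data());
    _exit(EXIT_FAILURE);  // only reached if execvp fails
  }

  int status = 0;
  if (waitpid(pid, &status, 0) == -1) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Storing the tokens in a std::vector<std::string> and only borrowing pointers for execvp avoids manual allocation, which should also keep valgrind quiet.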
Hints
- We highly suggest reading through the entire specification before starting and reviewing the UNIX lecture given in class.
- You have access to the Boost functions if you would like to #include and use them in your code. Most notably, this includes split and trim. We highly recommend you make use of split.
- Take inspiration from the provided sample programs and the programs provided in class relating to pipe, fork, and exec.
- Regularly test your code with valgrind to make sure you don’t have any memory errors, as these can cause problems in your code.
- You will almost certainly want to make use of the execvp(), fork(), pipe(), and waitpid() functions in your implementation.
- Some find it useful to explicitly handle three cases separately: no pipe, 1 pipe, and more than 1 pipe.
- There are two common strategies for creating the pipes needed to execute user input that has more than 1 pipe. One is to use an array of pipe file descriptors (e.g. int fds[N][2], where N is the number of pipes needed). Another is to create a new pipe before each fork (with the exception of the last process in the command).
- Recall that each child process used to execute a command only needs two pipe fds: the read end of the previous pipe and the write end of the next pipe.
- Be sure that each process closes all pipe file descriptors it does not use. Not doing this may cause your program not to terminate.
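The second strategy from the hints (create a new pipe before each fork, except for the last process) might be sketched as follows. This is one possible arrangement under stated assumptions, not the reference solution: run_pipeline is a hypothetical name, and each stage is assumed to be already split into a program name and its arguments.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: runs stages[0] | stages[1] | ... and returns the
// exit status of the last stage, or -1 on failure. `stages` is taken by
// value because execvp needs non-const char pointers into the strings.
int run_pipeline(std::vector<std::vector<std::string>> stages) {
  if (stages.empty()) return -1;
  for (const std::vector<std::string>& stage : stages) {
    if (stage.empty()) return -1;
  }

  std::vector<pid_t> children;
  int prev_read = -1;  // read end of the pipe that feeds the next stage
  bool ok = true;

  for (std::size_t i = 0; i < stages.size(); ++i) {
    const bool last = (i + 1 == stages.size());
    int fds[2] = {-1, -1};
    if (!last && pipe(fds) == -1) { ok = false; break; }

    std::vector<char*> argv;
    for (std::string& arg : stages[i]) argv.push_back(&arg[0]);
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid == -1) {
      if (!last) { close(fds[0]); close(fds[1]); }
      ok = false;
      break;
    }
    if (pid == 0) {
      // Child: keep only the two ends it needs, then close every pipe fd.
      if (prev_read != -1) { dup2(prev_read, STDIN_FILENO); close(prev_read); }
      if (!last) { dup2(fds[1], STDOUT_FILENO); close(fds[0]); close(fds[1]); }
      execvp(argv[0], argv.data());
      _exit(EXIT_FAILURE);  // only reached if execvp fails
    }

    // Parent: drop the ends it no longer needs and remember the new read end.
    children.push_back(pid);
    if (prev_read != -1) close(prev_read);
    if (!last) close(fds[1]);
    prev_read = last ? -1 : fds[0];
  }
  if (prev_read != -1) close(prev_read);

  // Wait only after every stage has been forked, so the stages run in parallel.
  int last_status = -1;
  for (pid_t pid : children) {
    int status = 0;
    bool exited = waitpid(pid, &status, 0) == pid && WIFEXITED(status);
    last_status = exited ? WEXITSTATUS(status) : -1;
  }
  return ok ? last_status : -1;
}
```

With this arrangement the parent holds at most one extra pipe fd at a time, which makes it harder to forget a close() than with a full array of pipes.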
Grading & Testing
Compilation
We have supplied you with a Makefile that can be used for compiling your code into an executable. To do this, open the terminal in Codio (this can be done by selecting Tools -> Terminal) and then type in make.
You may need to resolve any compiler warnings and compiler errors that show up. Once all compiler errors have been resolved, if you run ls in the terminal, you should be able to see an executable called pipe_shell. You can then run this by typing in ./pipe_shell and passing in various inputs to test your code.
Note that your submission will be partially evaluated on the number of compiler warnings. You should eliminate ALL compiler warnings in your code.
Valgrind
We will also use valgrind to test your submission for memory errors and memory leaks. To check this yourself, run:
valgrind --leak-check=full ./pipe_shell
If everything is correct, you should see the following towards the bottom of the output:
==1620== All heap blocks were freed -- no leaks are possible
==1620==
==1620== For counts of detected and suppressed errors, rerun with: -v
==1620== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If you do not see something similar to the above in your output, valgrind will have printed out details about where the errors and memory leaks occurred.
Testing
To test your implementation of pipe_shell, you can compare its behaviour and output to that of the provided solution binary.
Additionally, we have provided a few sample inputs and outputs in the tests directory that you can use for testing purposes. You can use the provided test files to automate the comparison of results.
For instance, if you wanted to test your code on the simple test case, you can run
cat ./tests/simple_input.txt | ./pipe_shell &> my_output.txt
and then compare the file my_output.txt to ./tests/simple_output.txt.
Reading the expected output of these can be a bit difficult, though, since the expected output files don’t contain the user input. To avoid this, you can use the diff program, which compares two files and prints any differences between them (or nothing if they are the same).
For example, you could run
diff my_output.txt ./tests/simple_output.txt
You can combine this with the previous command to do it all on one line with:
cat ./tests/simple_input.txt | ./pipe_shell &> my_output.txt && diff my_output.txt ./tests/simple_output.txt
Please don’t hesitate to post on Ed if you are having trouble testing your code!
Submission:
Please submit your completed pipe_shell.cpp to Gradescope.