Goals
In completing this assignment, you will learn how to:
- Share data between threads in a program
- Approach an “embarrassingly parallel” problem using threads
- Identify and resolve race conditions
Collaboration
For assignments in CIT 5950, you will complete each of them on your own or solo. However, you may discuss high-level ideas with other students, but any viewing, sharing, copying, or dictating of code is forbidden. If you are worried about whether something violates academic integrity, please post on Ed or contact the instructor.
Contents
Overview
Note: This assignment is split into two parts, but you are free to work on either part 1 or part 2 to start. You do NOT need to complete part 1 to complete part 2 or vice-versa. However, you will need to complete step 1 of part 2 before completing step 2 of part 2.
Image Blurring
In this part of the assignment, you will implement a multithreaded solution to an “embarrassingly parallel” problem, i.e. one that can obviously be improved by doing parts of it concurrently.
Many image manipulation algorithms are “embarrassingly parallel” because they modify individual pixels or groups of pixels, and changes to one pixel can be made independently of changes to others.
As you may know, a bitmap (BMP) image is one in which the pixels of the image are arranged in a grid, and each pixel’s color is defined using a combination of red, green, and blue (RGB).
Here, you will implement a simple algorithm to make an image appear blurry by setting each pixel’s RGB color to the average of those around it.
The number of surrounding pixels to include in calculating the average is determined by the “box size” (this approach to blurring is known as a “box blur”), using an x-y coordinate system.
In the example below, we want to blur the pixel at coordinate (x, y). If the box size is 1, we consider all pixels in the square with (x-1, y-1) in the upper left corner and (x+1, y+1) in the lower right corner. If the box size is 2, we consider the square defined by (x-2, y-2) and (x+2, y+2).
In general, the RGB color for the pixel at (x, y) should be the average of the pixels in the square defined by (x-b, y-b) and (x+b, y+b), including the pixel itself, where b is the box size. However, if part of the box is outside the valid range of the image, you should only take the average of the pixels that are actually inside the image.
As an example, here’s a bitmap image that you can start with (it’s on Codio as doge.bmp):
Here it is with a blur box of size 8:
And with a blur box of size 20:
(Much blurry. Wow. So fuzzy.)
Note that there are variations of the box blur algorithm that weigh the average based on how close the pixels are, or that make multiple blurring passes, but for simplicity, you only need to implement the algorithm as it is described above.
See the Instructions section below for details on the code files and how you should write some programs to do this
Sharing Data
In this part of the assignment, you will write a simple multithreaded program in which one thread blocks/waits to read double
s from a user while another thread will receive those values and print information about what the user has typed in.
You may be tempted to use a single global double
variable that can be used to store the values communicated by the reader thread to the other thread.
However, if we only used a single double
variable, it would not support the case where the reader thread reads in multiple double
s before the other thread that uses those doubles has a chance to get the values.
So, before we can write a program support this, you will need to implement a datastructure to handle the transfer of information from one thread to another.
This data structure will be called a DoubleQueue
.
DoubleQueue
should only be used as a way to communicate the double
data from one thread to another.
The printer thread may need to hold onto multiple doubles at a time, but a fixed-length array should work best for us.
Instructions
Part 1
Step 0
We will use the “Quick & Dirty BMP Library” (QDBMP) for working with the bitmap images.
The qdbmp.h
, qdbmp.cc
, cqdbmp.h
and cqdbmp.c
, files in Codio are all you need for this assignment, but more information about this library is available at http://qdbmp.sourceforge.net
To get you started, we have provided a program called negative.cc
that uses the QDBMP library to read a bitmap image and produce its negative by subtracting each pixel’s R, G, and B values from 255 (the maximum value for each).
Make sure you are able to compile this program using the Makefile we have provided (by running “make negative” in the terminal), and that you can run it. The first argument to the program is the name of the bitmap file to be modified, and the second is the name of the file to be created.
For instance, if you run
$ ./negative ./test_files/doge.bmp negative.bmp
using the doge.bmp file in Codio, the output file, negative.bmp
, should look like this:
(Wow. So inverted. Much scary. Wow.)
There is nothing to submit for this step but be sure you are able to compile, run, and understand the program before proceeding.
Step 1
Before you attempt to write a multithreaded version of the blurring program, make sure you’re able to do it sequentially first.
Copy negative.cc
to a file called blur_sequential.cc
and modify the code so that it implements the box blur algorithm as described above, using a box size that is specified as a command-line argument.
If the box size is 0 or negative, display an error message and terminate the program.
You should be able to run your code after it is compiled by running:
$ ./blur_sequential <input_file> <output_file> <block_size>
for example:
$ ./blur_sequential doge.bmp doge_blur_8.bmp 8
Hint: make sure your algorithm uses the original RGB values of the pixels and not the modified values. That is, if you are calculating the new values of a pixel, and you’ve already blurred the pixels above it and to its left, you need to make sure your calculations use the original RGB values of those pixels, and not the blurred ones.
Depending on the size of the bitmap, you may need a box size of at least 8-10 to be able to see that your program is working.
Step 2
Now copy blur_sequential.cc
to blur_parallel.cc
in which you implement the box blur algorithm using multiple threads.
The number of threads should be specified as a command-line argument, after the box size. If the number of threads is 0 or negative, display an error message and terminate the program.
Each of the N threads should concurrently process approximately 1/N of the pixels. How you approach this is up to you, but be careful that no pixel is being processed by more than one thread. We WILL be checking your code manually after the due date to make sure each thread is doing roughly 1/N of the work. We are not expecting it to be divided perfectly since the number of pixels may not divide evenly into the number of threads.
Hint: Don’t forget to “join” on all the threads before writing the modified bitmap to a new file.
Your program should still produce the same output as in Step 1, but presumably, be faster (especially for large box sizes) if the threads are truly running in parallel. However, due to the limited computing power on Codio, the performance gain may not be obvious/repeatable. Therefore, the autograder does NOT evaluate your program based on performance gain.
You should be able to run your code after it is compiled by running:
$ ./blur_parallel <input_file> <output_file> <block_size> <thread_count>
for example:
$ ./blur_parallel doge.bmp doge_blur_8.bmp 8 4
Make sure you run Helgrind to look for race conditions in your program, and use mutex locks in order to make the program thread safe if it is needed, but still taking advantage of concurrency.
See the Helgrind section below for more details.
Part 2
Step 1
In this part of the assignment, you will write a simple multithreaded program in which one thread blocks/waits to read double
s from a user while another thread will receive those values and print information about what the user has typed in.
Before we write our overall multi-threaded program, we must first write a data structure to facilitate the communication of data (double
s) between threads.
To do this, you will have to implement the class DoubleQueue
.
You MUST use the <pthread.h>
library for synchronization.
DoubleQueue
acts as a queue of double
s, but it is thread-safe. We have specified the public declarations of the DoubleQueue
class, and a declaration for a private struct type that may be useful for your implementation. Note that we do not provide any data members for you to use, you should modify DoubleQueue.h
to add any private data members you wish to use (hint: one of them should be of type pthread_mutex_t
). More information on the object and behavior can be found in DoubleQueue.h
.
We have provided a test_suite
for the DoubleQueue so that you can verify the implementation works correctly, but you may want to run the test_suite
under helgrind before moving on to the next step.
Note that to have DoubleQueue
be thread safe, you MUST use a lock in your data structure. It is up to you to decide when to use it. You may also find the use of a condition variable to be useful, but it is not required to get the solution working.
Step 2
Create a C++ program in a file called numbers.cc that consists of two threads:
- The first thread continuously loops, prompting the user to input a (floating point) number via the keyboard, sending the number to the second thread, and then sleeping for 1 second.
- The second thread reads in user input, and then displays:
- The maximum of the last five values that were input
- The minimum of the last five values that were input
- The average of the last five values that were input
- a listing of the last five values that were input, in order, starting with the oldest of the five (if fewer than five values have been input so far, only show the values that have been input)
You may be tempted to use DoubleQueue
in the printer thread to store the last 5-doubles input by the user.
We highly recommend you DO NOT USE DoubleQueue
to do this.
DoubleQueue
should only be used to communicate values from reader thread to printer thread.
The printer thread may need to hold onto multiple doubles at a time, but a fixed-length array should work best for us.
To facilitate the creation of such a program, you will have to use the class DoubleQueue
implemented in the previous step.
You MUST use the <pthread.h>
library to create, start, and join threads.
We have provided a program on codio called sequential_numbers.cc
that contains a sequential implementation of the described program.
We highly suggest that you take inspiration from this example when writing your multithreaded version.
If the user enters an input that is not a floating-point number or integer, your program should simply ignore it (but should not crash, of course!). This includes non-numeric strings such as “dog” or “banana”, or empty strings. Although you may of course ask for clarification on this, please do not spend too much time worrying about the different possible inputs, as that is not the focus of this assignment.
Although the program should ignore non-numeric strings, it should end gracefully (like, not crash) when the user inputs the EOF signifier (ctrl + d) in the terminal. You should make sure that both threads know to clean up and terminate once the EOF has been found.
How you organize your code is up to you, in terms of what the main thread does and whether you create one or two additional threads, but you should not continuously be starting new threads.
That is, there should only be two executions of pthread_create
at most, and each should run its own loop: one is waiting for input from the user or sleeping; the other is waiting for new values and then printing out the values.
You should also make sure that the reader and writer threads are running at the same time, you cannot just create a reader thread, join it, and then create a writer thread.
The two threads can communicate via global variables, one of which should be your DoubleQueue
, but you may find that you need to add more global variables to support the safe and clean exit of the program once the EOF input has been detected.
To make sure that the autograder interacts with your code correctly, your reader thread must sleep()
for 1 second after adding a double
to the shared DoubleQueue
.
To check whether your code is threadsafe, use Helgrind http://valgrind.org/docs/manual/hg-manual.html to check that you do not have any concurrency errors. We will be running Helgrind against your program during grading, and it should pass with no problems reported.
See the Helgrind section below for more details.
Suggested Approach
Below we have provided a suggested approach to this homework. Note that you are not required to follow this ordering if you believe another approach would work better for you. Also note that you can gradually check your progress, and run specific tests. Look at the gtest and Testing section below for more details on running individual tests.
- Start by populating
DoubleQueue.cc
with “empty” definitions of every member function. Afterwards, make sure that you can compile successfully. For example, you would write the following “empty” function in DoubleQueue.cc:
double DoubleQueue::wait_remove() {
return 0.0;
}
-
Create an empty
numbers.cc
file, you can start by copyingsequential_numbers.cc
. At this point, you should make sure that everything compiles successfully when you runmake
. -
Compile, run and read
negative.cc
. Make sure it works and that you understand what is going on in that program. -
Copy
negative.cc
intoblur_sequential.cc
. From here, modify the code so that it takes in more command line args and blurs the image based on those args. -
Implement
blur_parallel.cc
. You can start from yourblur_sequential.cc
code and modify the code so that it takes in more command line args and splits the work across threads. This may work for you, but you may find it useful to write the program from scratch. -
Implement all of
DoubleQueue.cc
except forwait_remove()
and make sure you pass theTest_DoubleQueue.add_remove
tests. -
Implement
DoubleQueue::wait_remove()
and make sure you pass the wholetest_suite
. Make sure that you also pass thetest_suite
under helgrind with no errors. -
Implement
numbers.cc
and make sure that you can run it with no helgrind errors and that the program terminates properly when the user types inctrl + d
.
NOTE: Just becuase you pass a test mentioned in one of the steps above doesn’t guarantee that the function is now correct. Other tests may reveal errors that you may have to go back and fix.
Hints
Here are a few hints and tips that you may find useful when approaching this homework.
-
Don’t wait until the end to test. As mentioned in the suggested approach above, you can create “empty” definitions for functions, just enough to compile, so that you can test your other functions.
-
For
blur_sequential.cc
andblur_parallel.cc
make sure your algorithm uses the original RGB values of the pixels and not the modified values. That is, if you are calculating the new values of a pixel, and you’ve already blurred the pixels above it and to its left, you need to make sure your calculations use the original RGB values of those pixels, and not the blurred ones. -
For
blur_parallel.cc
you may find you need to specify more than one value to the thread you create. As a result, you may want to create a struct and pass that to a thread, which is done in the lecture examplebetter_locking.cc
. -
For
blur_parallel.cc
, don’t forget to “join” on all the threads before writing the modified bitmap to a new file. -
in
numbers.cc
the printer thread may need to hold onto multiple doubles at a time that it has already read from theDoubleQueue
. A fixed-length array should work best for your code. -
In
numbers.cc
the reader thread needs to let the printer thread know when the EOF is read in, so that both threads can exit. However, the printer thread may be waiting using theDoubleQueue::wait_remove
function. One way you can fix this is by having the reader thread send any value into theDoubleQueue
. This will cause the printer thread to wake up, which can then check for EOF before printing out any information.
Grading and Testing
Compilation
We have supplied you with a makefile
that can be used for compiling your code into an executable. To do this, open the terminal in codio (this can be done by selecting Tools -> Terminal) and then type in make
.
You may need to resolve any compiler warnings and compiler errors that show up. Once all compiler errors have been resolved, if you ls
in the terminal, you should be able to see various executable listed.
One of which is called test_suite
.
You can then run this by typing in ./test_suite
to see the evaluation of your DoubleQueue
.
How to run other executables is described below.
Note that your submission will be partially evaluated on the number of compiler warnings. You should eliminate ALL compiler warnings in your code.
Valgrind
We will also test your submission on whether there are any memory errors or memory leaks. We will be using valgrind to do this. To do this, you should try running:
valgrind --leak-check=full ./test_suite
You will also want to run valgrind on the other programs you write for this assignment, you can do this by running:
valgrind --leak-check=full ./numbers
valgrind --leak-check=full ./blur_sequential doge.bmp doge_blur_8.bmp 8
valgrind --leak-check=full ./blur_parallel doge.bmp doge_blur_8.bmp 8 4
Note that for blur_sequential
and blur_parallel
, you can run those programs under different command line args.
If everything is correct, you should see the following towards the bottom of the output:
==1620== All heap blocks were freed -- no leaks are possible
==1620==
==1620== For counts of detected and suppressed errors, rerun with: -v
==1620== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If you do not see something similar to the above in your output, valgrind will have printed out details about where the errors and memory leaks occurred.
Helgrind
We will also test your submission on whether there are any thread errors.
We will be using the tool helgrind to do this, which is a tool under valgrind
To do this, you should try running something like:
valgrind --tool=helgrind ./test_suite
You can replace ./test_suite
with the other programs and their respective command line arguments
If everything is correct, you should see the following towards the bottom of the output:
==14004== For counts of detected and suppressed errors, rerun with: -v
==14004== Use --history-level=approx or =none to gain increased speed, at
==14004== the cost of reduced accuracy of conflicting-access information
==14004== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 216 from 42)
Your output may look slightly different, but the most important thing here is that it says 0 errors from 0 contexts
.
gtest
As with hw0, you can compile the your implementation by using the make
command. This will result in several output files, including an executable called test_suite
.
After compiling your solution with make
, You can run all of the tests for DoubleQueue
by invoking:
./test_suite
You can also run only specific tests by passing command line arguments into test_suite
There are only two tests in this test suite so your only options are:
./test_suite --gtest_filter=Test_DoubleQueue.add_remove
and
./test_suite --gtest_filter=Test_DoubleQueue.wait_remove
These settings can be helpful for debugging specific parts of the assignment, especially since test_suite
can be run with these settings through valgrind
and gdb
!
Testing
Testing Blur
To test your blur programs (both sequential and parallel), you should run your program on the doge.bmp
and duck.bmp
files provided in the test_files
directory with either 8 or 20 as the block size.
We have provided what the expected output in the test_files
directory to compare against.
To compare your bmp output with the provided ones, you can make use of the provided program compare_bmp.cc
.
Run the program with your output image and the appropriate example image to see how the images compare.
For example:
./compare_bmp my_blurred.bmp ./test_files/doge_blur_20.bmp
will compare your output with the expected output of blurring doge.bmp with bocksize equal to 20.
You should see Incorrect pixels is 0
in the output if your code output the correct image.
Testing DoubleQueue
For DoubleQueue, the test_suite
should be fine enough to verify the behavior.
You should run the test_suite
under helgrind though to verify that it is thread safe.
valgrind --tool=helgrind ./test_suite
Testing Numbers
To test your program numbers, you can compare the behaviour/output of it to the provided program, sequential_numbers.cc Additionally, we have provided a few sample inputs and outputs in test_files that you can use for testing purposes.
NOTE: if you have fully implemented the numbers.cc
program and output looks correct when you run the program, but it fails on the autograder, make sure that the reading thread sleeps for 1 second after adding a double to the double queue.
Submission:
Please submit your completed blur_sequential.cc
, blur_parallel.cc
, DoubleQueue.h
, DoubleQueue.cc
, and numbers.cc
to Gradescope
Note: In addition to the autograder, we will be manually reviewing the submissions for this homework assignment to ensure that you are using threads and that you are using them as described in the write-up. Just because you get full points on the test suite does not mean that you will get full points after we review your submissions manually.
Note: It is expected to take a while for this to run.