CIT 5950 (Spring 2023) HW 02: Threads

Threads & Synchronization in C++.

Goals

In completing this assignment, you will learn how to:

Collaboration

For assignments in CIT 5950, you will complete each of them on your own or solo. However, you may discuss high-level ideas with other students, but any viewing, sharing, copying, or dictating of code is forbidden. If you are worried about whether something violates academic integrity, please post on Ed or contact the instructor.

Contents

Overview

Image Blurring

In this part of the assignment, you will implement a multithreaded solution to an “embarrassingly parallel” problem, i.e. one that can obviously be improved by doing parts of it concurrently.

Many image manipulation algorithms are “embarrassingly parallel” because they modify individual pixels or groups of pixels, and changes to one pixel can be made independently of changes to others.

As you may know, a bitmap (BMP) image is one in which the pixels of the image are arranged in a grid, and each pixel’s color is defined using a combination of red, green, and blue (RGB).

Here, you will implement a simple algorithm to make an image appear blurry by setting each pixel’s RGB color to the average of those around it.

The number of surrounding pixels to include in calculating the average is determined by the “box size” (this approach to blurring is known as a “box blur”), using an x-y coordinate system.

In the example below, we want to blur the pixel at coordinate (x, y). If the box size is 1, we consider all pixels in the square with (x-1, y-1) in the upper left corner and (x+1, y+1) in the lower right corner. If the box size is 2, we consider the square defined by (x-2, y-2) and (x+2, y+2).

blur_layout

In general, the RGB color for the pixel at (x, y) should be the average of the pixels in the square defined by (x-b, y-b) and (x+b, y+b), including the pixel itself, where b is the box size. However, if part of the box is outside the valid range of the image, you should only take the average of the pixels that are actually inside the image.

As an example, here’s a bitmap image that you can start with (it’s on Codio as doge.bmp):

doge

Here it is with a blur box of size 8:

doge_8

And with a blur box of size 20:

doge_20

(Much blurry. Wow. So fuzzy.)

Note that there are variations of the box blur algorithm that weigh the average based on how close the pixels are, or that make multiple blurring passes, but for simplicity, you only need to implement the algorithm as it is described above.

See the Instructions section below for details on the code files and how you should write some programs to do this

Sharing Data

In this part of the assignment, you will write a simple multithreaded program in which one thread blocks/waits to read doubles from a user while another thread will receive those values and print information about what the user has typed in. You may be tempted to use a single global double variable that can be used to store the values communicated by the reader thread to the other thread. However, if we only used a single double variable, it would not support the case where the reader thread reads in multiple doubles before the other thread that uses those doubles has a chance to get the values. So, before we can write a program support this, you will need to implement a datastructure to handle the transfer of information from one thread to another.

This data structure will be called a DoubleQueue.

Instructions

Part 1

Step 0

We will use the “Quick & Dirty BMP Library” (QDBMP) for working with the bitmap images. The qdbmp.h, qdbmp.cc, cqdbmp.h and cqdbmp.c, files in Codio are all you need for this assignment, but more information about this library is available at http://qdbmp.sourceforge.net

To get you started, we have provided a program called negative.cc that uses the QDBMP library to read a bitmap image and produce its negative by subtracting each pixel’s R, G, and B values from 255 (the maximum value for each).

Make sure you are able to compile this program using the Makefile we have provided (by running “make negative” in the terminal), and that you can run it. The first argument to the program is the name of the bitmap file to be modified, and the second is the name of the file to be created.

For instance, if you run

$ ./negative ./test_files/doge.bmp negative.bmp

using the doge.bmp file in Codio, the output file, negative.bmp, should look like this:

doge_negative

(Wow. So inverted. Much scary. Wow.)

There is nothing to submit for this step but be sure you are able to compile, run, and understand the program before proceeding.

Step 1

Before you attempt to write a multithreaded version of the blurring program, make sure you’re able to do it sequentially first.

Copy negative.cc to a file called blur_sequential.cc and modify the code so that it implements the box blur algorithm as described above, using a box size that is specified as a command-line argument. If the box size is 0 or negative, display an error message and terminate the program.

You should be able to run your code after it is compiled by running:

$ ./blur_sequential <input_file> <output_file> <block_size>

for example:

$ ./blur_sequential doge.bmp doge_blur_8.bmp 8

Hint: make sure your algorithm uses the original RGB values of the pixels and not the modified values. That is, if you are calculating the new values of a pixel, and you’ve already blurred the pixels above it and to its left, you need to make sure your calculations use the original RGB values of those pixels, and not the blurred ones.

Depending on the size of the bitmap, you may need a box size of at least 8-10 to be able to see that your program is working.

Step 2

Now copy blur_sequential.cc to blur_parallel.cc in which you implement the box blur algorithm using multiple threads.

The number of threads should be specified as a command-line argument, after the box size. If the number of threads is 0 or negative, display an error message and terminate the program.

Each of the N threads should concurrently process approximately 1/N of the pixels. How you approach this is up to you, but be careful that no pixel is being processed by more than one thread. We WILL be checking your code manually after the due date to make sure each thread is doing roughly 1/N of the work. We are not expecting it to be divided perfectly since the number of pixels may not divide evenly into the number of threads.

Hint: Don’t forget to “join” on all the threads before writing the modified bitmap to a new file.

Your program should still produce the same output as in Step 1, but presumably, be faster (especially for large box sizes) if the threads are truly running in parallel. However, due to the limited computing power on Codio, the performance gain may not be obvious/repeatable. Therefore, the autograder does NOT evaluate your program based on performance gain.

You should be able to run your code after it is compiled by running:

$ ./blur_parallel <input_file> <output_file> <block_size> <thread_count>

for example:

$ ./blur_parallel doge.bmp doge_blur_8.bmp 8 4

Make sure you run Helgrind to look for race conditions in your program, and use mutex locks in order to make the program thread safe if it is needed, but still taking advantage of concurrency.

See the Helgrind section below for more details.

Part 2

Step 1

In this part of the assignment, you will write a simple multithreaded program in which one thread blocks/waits to read doubles from a user while another thread will receive those values and print information about what the user has typed in. Before we write our overall multi-threaded program, we must first write a data structure to facilitate the communication of data (doubles) between threads. To do this, you will have to implement the class DoubleQueue.

You MUST use the <pthread.h> library for synchronization.

DoubleQueue acts as a queue of doubles, but it is thread-safe. We have specified the public declarations of the DoubleQueue class, and a declaration for a private struct type that may be useful for your implementation. Note that we do not provide any data members for you to use, you should modify DoubleQueue.h to add any private data members you wish to use (hint: one of them should be of type pthread_mutex_t). More information on the object and behavior can be found in DoubleQueue.h.

We have provided a test_suite for the DoubleQueue so that you can verify the implementation works correctly, but you may want to run the test_suite under helgrind before moving on to the next step.

Note that to have DoubleQueue be thread safe, you MUST use a lock in your data structure. It is up to you to decide when to use it. You may also find the use of a condition variable to be useful, but it is not required to get the solution working.

Step 2

Create a C++ program in a file called numbers.cc that consists of two threads:

To facilitate the creation of such a program, you will have to use the class DoubleQueue implemented in the previous step.

You MUST use the <pthread.h> library to create, start, and join threads.

We have provided a program on codio called sequential_numbers.cc that contains a sequential implementation of the described program. We highly suggest that you take inspiration from this example when writing your multithreaded version.

If the user enters an input that is not a floating-point number or integer, your program should simply ignore it (but should not crash, of course!). This includes non-numeric strings such as “dog” or “banana”, or empty strings. Although you may of course ask for clarification on this, please do not spend too much time worrying about the different possible inputs, as that is not the focus of this assignment.

Although the program should ignore non-numeric strings, it should end gracefully (like, not crash) when the user inputs the EOF signifier (ctrl + d) in the terminal. You should make sure that both threads know to clean up and terminate once the EOF has been found.

How you organize your code is up to you, in terms of what the main thread does and whether you create one or two additional threads, but you should not continuously be starting new threads. That is, there should only be two executions of pthread_create at most, and each should run its own loop: one is waiting for input from the user or sleeping; the other is waiting for new values and then printing out the values. You should also make sure that the reader and writer threads are running at the same time, you cannot just create a reader thread, join it, and then create a writer thread.

The two threads can communicate via global variables, one of which should be your DoubleQueue, but you may find that you need to add more global variables to support the safe and clean exit of the program once the EOF input has been detected.

To make sure that the autograder interacts with your code correctly, your reader thread must sleep() for 1 second after adding a double to the shared DoubleQueue.

To check whether your code is threadsafe, use Helgrind http://valgrind.org/docs/manual/hg-manual.html to check that you do not have any concurrency errors. We will be running Helgrind against your program during grading, and it should pass with no problems reported.

See the Helgrind section below for more details.

Suggested Approach

Below we have provided a suggested approach to this homework. Note that you are not required to follow this ordering if you believe another approach would work better for you. Also note that you can gradually check your progress, and run specific tests. Look at the gtest and Testing section below for more details on running individual tests.

  1. Start by populating DoubleQueue.cc with “empty” definitions of every member function. Afterwards, make sure that you can compile successfully. For example, you would write the following “empty” function in DoubleQueue.cc:
double DoubleQueue::wait_remove() {
  return 0.0;
}
  1. Create an empty numbers.cc file, you can start by copying sequential_numbers.cc. At this point, you should make sure that everything compiles successfully when you run make.

  2. Compile, run and read negative.cc. Make sure it works and that you understand what is going on in that program.

  3. Copy negative.cc into blur_sequential.cc. From here, modify the code so that it takes in more command line args and blurs the image based on those args.

  4. Implement blur_parallel.cc. You can start from your blur_sequential.cc code and modify the code so that it takes in more command line args and splits the work across threads. This may work for you, but you may find it useful to write the program from scratch.

  5. Implement all of DoubleQueue.cc except for wait_remove() and make sure you pass the Test_DoubleQueue.add_remove tests.

  6. Implement DoubleQueue::wait_remove() and make sure you pass the whole test_suite. Make sure that you also pass the test_suite under helgrind with no errors.

  7. Implement numbers.cc and make sure that you can run it with no helgrind errors and that the program terminates properly when the user types in ctrl + d.

Hints

Here are a few hints and tips that you may find useful when approaching this homework.

  1. Don’t wait until the end to test. As mentioned in the suggested approach above, you can create “empty” definitions for functions, just enough to compile, so that you can test your other functions.

  2. For blur_sequential.cc and blur_parallel.cc make sure your algorithm uses the original RGB values of the pixels and not the modified values. That is, if you are calculating the new values of a pixel, and you’ve already blurred the pixels above it and to its left, you need to make sure your calculations use the original RGB values of those pixels, and not the blurred ones.

  3. For blur_parallel.cc you may find you need to specify more than one value to the thread you create. As a result, you may want to create a struct and pass that to a thread, which is done in the lecture example better_locking.cc.

  4. For blur_parallel.cc, don’t forget to “join” on all the threads before writing the modified bitmap to a new file.

  5. in numbers.cc the printer thread may need to hold onto multiple doubles at a time that it has already read from the DoubleQueue. A fixed-length array should work best for your code.

  6. In numbers.cc the reader thread needs to let the printer thread know when the EOF is read in, so that both threads can exit. However, the printer thread may be waiting using the DoubleQueue::wait_remove function. One way you can fix this is by having the reader thread send any value into the DoubleQueue. This will cause the printer thread to wake up, which can then check for EOF before printing out any information.

Grading and Testing

Compilation

We have supplied you with a makefile that can be used for compiling your code into an executable. To do this, open the terminal in codio (this can be done by selecting Tools -> Terminal) and then type in make.

You may need to resolve any compiler warnings and compiler errors that show up. Once all compiler errors have been resolved, if you ls in the terminal, you should be able to see various executable listed. One of which is called test_suite. You can then run this by typing in ./test_suite to see the evaluation of your DoubleQueue. How to run other executables is described below.

Note that your submission will be partially evaluated on the number of compiler warnings. You should eliminate ALL compiler warnings in your code.

Valgrind

We will also test your submission on whether there are any memory errors or memory leaks. We will be using valgrind to do this. To do this, you should try running:

valgrind --leak-check=full ./test_suite

You will also want to run valgrind on the other programs you write for this assignment, you can do this by running:

valgrind --leak-check=full ./numbers

valgrind --leak-check=full ./blur_sequential doge.bmp doge_blur_8.bmp 8

valgrind --leak-check=full ./blur_parallel doge.bmp doge_blur_8.bmp 8 4

Note that for blur_sequential and blur_parallel, you can run those programs under different command line args.

If everything is correct, you should see the following towards the bottom of the output:

 ==1620== All heap blocks were freed -- no leaks are possible
 ==1620==
 ==1620== For counts of detected and suppressed errors, rerun with: -v
 ==1620== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If you do not see something similar to the above in your output, valgrind will have printed out details about where the errors and memory leaks occurred.

Helgrind

We will also test your submission on whether there are any thread errors. We will be using the tool helgrind to do this, which is a tool under valgrind To do this, you should try running something like: valgrind --tool=helgrind ./test_suite

You can replace ./test_suite with the other programs and their respective command line arguments

If everything is correct, you should see the following towards the bottom of the output:

==14004== For counts of detected and suppressed errors, rerun with: -v
==14004== Use --history-level=approx or =none to gain increased speed, at
==14004== the cost of reduced accuracy of conflicting-access information
==14004== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 216 from 42)

Your output may look slightly different, but the most important thing here is that it says 0 errors from 0 contexts.

gtest

As with hw0, you can compile the your implementation by using the make command. This will result in several output files, including an executable called test_suite.

After compiling your solution with make, You can run all of the tests for DoubleQueue by invoking:

./test_suite

You can also run only specific tests by passing command line arguments into test_suite

There are only two tests in this test suite so your only options are:

./test_suite --gtest_filter=Test_DoubleQueue.add_remove

and

./test_suite --gtest_filter=Test_DoubleQueue.wait_remove

These settings can be helpful for debugging specific parts of the assignment, especially since test_suite can be run with these settings through valgrind and gdb!

Testing

Testing Blur

To test your blur programs (both sequential and parallel), you should run your program on the doge.bmp and duck.bmp files provided in the test_files directory with either 8 or 20 as the block size. We have provided what the expected output in the test_files directory to compare against.

To compare your bmp output with the provided ones, you can make use of the provided program compare_bmp.cc. Run the program with your output image and the appropriate example image to see how the images compare. For example:

./compare_bmp my_blurred.bmp ./test_files/doge_blur_20.bmp

will compare your output with the expected output of blurring doge.bmp with bocksize equal to 20. You should see Incorrect pixels is 0 in the output if your code output the correct image.

Testing DoubleQueue

For DoubleQueue, the test_suite should be fine enough to verify the behavior. You should run the test_suite under helgrind though to verify that it is thread safe.

valgrind --tool=helgrind ./test_suite

Testing Numbers

To test your program numbers, you can compare the behaviour/output of it to the provided program, sequential_numbers.cc Additionally, we have provided a few sample inputs and outputs in test_files that you can use for testing purposes.

Submission:

Please submit your completed blur_sequential.cc, blur_parallel.cc, DoubleQueue.h, DoubleQueue.cc, and numbers.cc to Gradescope

Note: In addition to the autograder, we will be manually reviewing the submissions for this homework assignment to ensure that you are using threads and that you are using them as described in the write-up. Just because you get full points on the test suite does not mean that you will get full points after we review your submissions manually.

Note: It is expected to take a while for this to run.