Goals
- Practice with Objects in C++
- Practice with std::string, std::vector, const, references, and other fundamental aspects of C++.
Collaboration
For assignments in CIT 5950, you will complete each of them on your own or solo. However, you may discuss high-level ideas with other students, but any viewing, sharing, copying, or dictating of code is forbidden. If you are worried about whether something violates academic integrity, please post on Ed or contact the instructor.
Setup
Task 1. Setup the development environment for the course.
For this assignment, you need to setup a Linux C++ development environment. Please follow the instructions here: Enivronment setup
Instructions
You can downlowd the starter files into your docker container by running the following command:
curl -o simplekv.zip https://www.seas.upenn.edu/~cit5950/current/projects/code/simplekv.zip
You can also download the files manually here if you would like: simplekv.zip
From here, you need to extract the files by running
unzip simplekv.zip
From here you can either open the project in Vim or VSCode.
For Vim, you just need to run
cd simplekv
vim SimpleKV.hpp
For VSCode you will have to follow steps similar to what we did to open chcek_setup
in the setup document.
Here is a brief overview of the files:
Files that you will edit:
-
SimpleKV.hpp
: Contains the declarations and extensive comments detailing the functions you will have to implement for this assignment. You will also have to edit this file to choose which private data members the SimpleKV object will contain. -
SimpleKV.cpp
: Where you will be implementing all of the functions you need for the assignment. You will need to write every function that is specified inSimpleKV.hpp
Compilation & Test Files
-
test_basic.cpp
: This is the first file for testing out some basic functionality in your code -
test_stress.cpp
: This is the second file for a more thorough and long test of your simplekv implementation. -
Makefile
: Used for compiling your project, just type “Make” in the terminal and your project should build. -
test_suite.cpp
: You do not need to open this file, it is part of setting up the tests for the project. -
catch.hpp
: You do not need to open this file, this is part of the implementation for the C++ testing framework called “Catch2”. -
catch.cpp
: You do not need to open this file, this is part of the implementation for the C++ testing framework called “Catch2”.
We recommend reading all of the files for a part before your start that part, especially SimpleKV.hpp
.
All of the code you will submit for this assignment should be contained in SimpleKV.cpp
and SimpleKV.hpp
. This means that your code should work with the other files as they are when initially supplied.
Note that you can modify the other files if you like, but we will be testing your SimpleKV.cpp
and SimpleKV.hpp
against un-modified versions of the files.
Overview
In this assignment we will be implementing a C++ object called SimpleKV
that supports the functionality of a Key-Value store.
This object will be very to a “Dictionary” or “Map” data structure but with some slight alterations.
The core of what we are implementing is a map<string, string>
(A map from string keys to string values). There is more functionality on top of this, but this is the base of it.
You should be familar with the Map data structure from CIT 5910 already, and I recommend youbut I will breifly go over it again here just in case.
If you feel comfortable with the idea of a map<string, string>
then you can move on to the next section: Name Spaces, but I will breifly go over what a map is here.
In a key value store, we store pairs called key value pairs. That is, we store key and values. We can then look up values based on their key, similarly to how we can look up values in an array or list based on their index.
Consider the following example code that uses a map:
We want to store assignments and their corresponding grade,
so we can store that information in a map
(or unordered_map
);
// creates a map to store assignment names
// and their associated score is
map<string, string> grades {};
// add new assignments and their scores to the map
grades["hw0"] = "B+";
grades["hw1"] = "A-";
grades["hw2"] = "C-";
grades["hw3"] = "C+;
grades["hw4"] = "F";
grades["project"] = "A+";
// a key can only exist once in a map, so this next line
// updates the score for "hw4" to be a new value and
// does not insert a new key called "hw4"
grades["hw4"] = "A+";
// check to see if there is an assignment called "pennos"
// inside the map
bool pennos_exists = grades.contains("pennos"); // should return false
// print out all of the assignments and their grades
// hw0: B+
// hw1: A-
// hw2: C-
// hw3: C+
// hw4: A+
// project: A+
for (auto& assignment: grades) {
cout << assignment.first << ": " << assignment.second << endl;
}
for more information on how maps work, you can take a look at these slides from CIT 5910 that covered maps
Namespaces
Each key-value pair we store in the SimpleKV object will also belong to a specific “namespace”.
A namespace is a concept that exists beyond this assignment. Generally namespaces are a way to separate names so that items can be uniquely identified by their name. Namespaces are a way to prevent conflicts when multiple items have the same name.
Consider the following C++ code:
void foo() {
int x = 3;
int x = 5;
}
In this example, the code would not compile because we try to declare two integer variables of the same name. We have a name conflict in the name of these variables. In this example, we could say we are trying to put the same name twice in the same namespace, which would be a conflict.
Consider the following similar code:
int x = 1;
void foo() {
int x = 3;
}
void bar() {
int x = 10;
}
Again we try and delcare the variable x multiple times, but each time we do it in a separate scope. This is similar to the idea of separate namespaces, where we have the variable “x” but each declaration is in a different namespace. The compiler (and programmer) can distinguish between different “x” variables based on whether it is “the x in foo” or “the x in bar”.
Lets see how this applies to our idea of a map data structure. Consider we again wanted to store assignments and their corresponding grade, but this time, we want to store not only the grades for CIT 5950, but also for another course CIT 5940.
// creates a map to store assignment names
// and their associated score
map<string, string> grades {};
// add all of the assignments for CIT 5950:
grades["hw0"] = "B+";
grades["hw1"] = "A-";
grades["hw2"] = "C-";
grades["hw3"] = "C+;
grades["hw4"] = "F";
grades["project"] = "A+";
// add all of the assignments for CIT 5940:
grades["hw0"] = "B+";
grades["hw1"] = "A-";
grades["hw2"] = "A-";
grades["hw3"] = "B;
grades["hw4"] = "D";
grades["hw5"] = "A";
// print out all of the assignments and their grades
// hw0: B+
// hw1: A-
// hw2: A-
// hw3: B
// hw4: D
// hw5: A
// project: A+
for (auto& assignment: grades) {
cout << assignment.first << ": " << assignment.second << endl;
}
In this example, we try to store the assignments for both classes in the same map, and we have name conflicts for some of the assignments, meaning that some of the assignments for 5950 have their grade overwritten. We can fix this by introducing a namespace. Here I will demonstrate how we would solve this problem using the SimpleKV object you will be implementing in this assignment
SimpleKV kv{};
// add all the CIT 5950 grades to the
// "CIT 5950" namespace
// sset takes in a namespace, a key and a value
kv.sset("CIT 5950", "hw0", "B+");
kv.sset("CIT 5950", "hw1", "A-");
kv.sset("CIT 5950", "hw2", "C-");
kv.sset("CIT 5950", "hw3", "C+");
kv.sset("CIT 5950", "hw4", "F");
kv.sset("CIT 5950", "project", "A+");
// add all the CIT 5940 grades to the
// "CIT 5940" namespace
// sset takes in a namespace, a key and a value
kv.sset("CIT 5940", "hw0", "B+");
kv.sset("CIT 5940", "hw1", "A-");
kv.sset("CIT 5940", "hw2", "A-");
kv.sset("CIT 5940", "hw3", "B");
kv.sset("CIT 5940", "hw4", "D");
kv.sset("CIT 5940", "hw5", "A");
In SimpleKV
, we store our key-values in an associated namespace to avoid the name
conflict issue. This is a feature you will have to support when you implement SimpleKV.
List Values
Another feature that we support with SimpleKV, is that the values can be either strings or lists of strings. Note that this is an exclusive “or”, a value cannot be both a string and a list of strings. Keys and namespace names can only be strings. This adds complication since you will need some way of storing values that are of either type and supporting operations on either type.
Documentation for what exactly can be done with a list type can be found in the comments in SimpleKV.hpp
Handling two differnt types of values
Your kv needs to be able to hold values that are either string or list (never both at the same time).
Keys and namespace names can only be strings, only values may be lists or strings.
This can be a bit complicated to do, and there are many ways to do it. One way is to maintain two different
private data structures that keep track of list values and string values separately. Another is to use the variant
type that was briefly discussed in class.
You are free to do either of these (or something else), but this is something you should take into consideration when desigining your code.
Allowed Objects & Functions
You are restricted to using the #include
’s mentioned in lecture.
you may ask if a specific include is allowed on Ed if you are unsure.
<iostream>
<string>
<vector>
<unordered_map>
<unordered_set>
<map>
<set>
<optional>
<variant>
Suggested Ordering
One of the first things you will have to do in this assignment is choose which data structures you will need to implement the SimpleKV. We provide a header file for you, but it does not contain any data members, and you will have to fill that out.
We highly recommend you read all of the spec and read through most of the header file to get an idea for what you need to do in the assignment. This will also help you decide which data structure you can use. As a hint, I recommend taking a look at the Allowed Objects & Functions section above. You are limited to those data structure, think about wht combination of them will you need. (remember that you can put data structures inside of other data structures, e.g. map with vector values)
From here, we recommend implementing a couple functions in combination: sget
, sset
, namespaces
, keys
, ns_exists
, key_exists
, and del
.
You should be able to pass a some of the basic tests, in particular, running
$ ./test_suite [basic]
should pass all tess.
Once you have that passing, implement lpush
, lmembers
and type
. You should then pass a few more tests
$ ./test_suite [basic list]
From here, gradually implement functions untill you are passing the [complex list]
, and [stress]
tests.
From here, you should be able to pass all tests in the testsuite if you just run ./test_suite
Grading, Testing and Checking
We will be evaluating your submission mostly automatically. Specifically, we are checking for the following automatically:
- that you have no compilation errors or warnings
- that catch2 passes all tests
- that you have no valgrind errors
- that clang-tidy and clang-format contain no errors
More details on each of these are in the below sections
We will also briefly check submission by hand to make sure you do not do anything weird to pass the tests with an incorrect solution.
Compilation
We have supplied you with a makefile
that can be used for compiling your code into an executable. To do this, open the terminal and then type in make
.
You may need to resolve any compiler warnings and compiler errors that show up. Once all compiler errors have been resolved, if you ls
in the terminal, you should be able to see an executable called test_suite
. You can then run this by typing in ./test_suite
to see the evaluation of your code.
Note that your submission will be partially evaluated on the number of compiler warnings. You should eliminate ALL compiler warnings in your code.
It is expected for the first time you copmile your code to take up to 10 minutes. This is mostly due to test_stress.cpp
.
You should not need to recompile this file, so it should not take so long again.
Valgrind
We will also test your submission on whether there are any memory errors or memory leaks. We will be using valgrind to do this. To do this, you should try running:
valgrind --leak-check=full ./test_suite
If everything is correct, you should see the following towards the bottom of the output:
==1620== All heap blocks were freed -- no leaks are possible
==1620==
==1620== For counts of detected and suppressed errors, rerun with: -v
==1620== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If you do not see something similar to the above in your output, valgrind will have printed out details about where the errors and memory leaks occurred.
Note: It is expected to take a while for valgrind to run on the whole test_suite
. You can possible go faster by only running valgrind on tests.
See the testing section above for information on running individual tests.
Clang Format and Clang Tidy
The makefile we provided with this assignment is configured to help make sure your code is properly formatted and follows good (modern) C++ coding conventions.
To do this, we make use of two tools: clang-format
and clang-tidy
To make sure your code identation is nice, we have clang format
. All you need to do to use this utility, is to run make format
, which will run the tool and indent your code propely.
Code that is turned in is expected to follow the specified style, code that is ill-formated must be fixed and re-submitted.
clang-tidy
is a more complicated tool. Part of it is checking for style and readability issues but that is not all it does. Examples of readability issues include:
Not using curly braces around if statements and loops:
if (condition) // clang-tidy will complain about missing curly braces
cout << "hello!" << endl;
Declaring variables or parmaters with names that are too short:
void foo(char c) { // clang-tidy will complain about the name `c`
// does something
}
Having functions that are too complex and long. The tool calculates “cognitive complexity” of your code and will complain about anything that is too complex.
This means you should think about how to break your code into helpers, because if you don’t, clang-tidy
will complain and you will face a deduction.
More on this specific error can be found here: Cognitive Complexity
clang-tidy
is also useful for noticing some memory errors and pointing out bad practices when writing C++ code.
Because of all this, we are enforcing that your code does not produce any clang-tidy
errors. You can run clang-tidy on your code by running: make tidy-check
.
Whenever you compile your code using make
then it should also re-run clang-tidy
to check your code for errors.
Note that you will have to fix any compiler errors before clang-tidy will run (and be useful).
Code that has clang-format
, clang-tidy
or any compiler errors will face a deduction.
If you have any questions about understanding an error, please ask on Ed discussion and we will happily assist you.
Submission
Please submit your completed SimpleKV.hpp
, and SimpleKV.cpp
to Gradescope