Goals
- Introduction to C string code, with more exposure to:
- Writing Header Files
- Writing a Makefile
- Dynamic Memory Allocation
- Strings and pointers in C
Collaboration
For assignments in CIS 2400, you will complete each one on your own. You may discuss high-level ideas with other students, but any viewing, sharing, copying, or dictating of code is forbidden. If you are worried about whether something violates academic integrity, please post on Ed or contact the instructor(s).
Setup
If you haven’t already, you need to follow the Docker Setup. We recommend you try and figure this out ASAP.
Once you have the environment set up, you should boot it up and download the zip file with the following command:
curl -o cstring.zip https://www.seas.upenn.edu/~cis2400/current/projects/code/cstring.zip
After downloading the zip file, you should be able to type the command “ls” in the terminal, or use the file explorer, to see the downloaded zip file. Once you confirm that the file is downloaded, run the following command to decompress the files:
unzip cstring.zip
You should now have a directory called cstring
that contains the starter files for this assignment.
You will also need to create the files cstring.h
and cstring.c
and populate those files over the course of this assignment. We refer you to the beginning of the HW0 spec for how to create empty files in the terminal.
Instructions
Once you have followed all of the setup instructions, you should have a folder that contains the files for this assignment: Makefile, test_suite.cpp, catch.hpp, catch.cpp, test_cstring.cpp, use_cstring.c, cstring.c, and cstring.h.
File Overview:
- cstring.c and cstring.h: These start off as empty files that you create. This is where you will write your string module.
- Makefile: Used for compiling the code in this homework assignment. You will need to complete it.
- use_cstring.c: A simple C file with a main() function that uses cstrtok_r. You are entirely free to modify this file however you like. Its sole purpose is to demonstrate how cstrtok_r is called and to give you a place to call your cstring functions if you want to test them there.
- test_cstring.cpp: Contains the tests for the cstring functions you will implement. Note that this is a C++ file and the test framework (Catch2) is in C++, but we kept this file as close to C style as possible. You should be able to read the file and figure out what is going on in it (for the most part).
- catch.hpp, catch.cpp, and test_suite.cpp: Contain the core of the testing infrastructure so that we can run the tests that are written in test_cstring.cpp. You do not need to open these files.
Required Knowledge
This homework has you writing a small C module and will require knowledge of the following to complete:
- Compilation
- Makefiles
- Header files
- Pointers
- Memory Allocation
- “strings” in C
Warning (repeated here because it is a really common issue): C doesn’t give variables a default initial value like Java does. When you declare a new variable, be sure that you assign it a value.
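As a small illustration of the warning above, the sketch below uses a hypothetical helper, count_char, that is not part of the assignment. The counter is explicitly set to 0 before it is used. Without that initializer its starting value would be indeterminate.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the assignment): counts how many
 * times target appears in the null-terminated string str. */
unsigned int count_char(const char *str, char target) {
    /* Explicit initialization. Leaving off "= 0" would make the
     * starting value indeterminate, unlike a Java local variable. */
    unsigned int count = 0;
    for (size_t i = 0; str[i] != '\0'; i++) {
        if (str[i] == target) {
            count++;
        }
    }
    return count;
}
```

Compiling with warnings enabled can catch some uninitialized variables, but not all of them, so initializing at the point of declaration is the safer habit.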
Overview
For this assignment, you will be writing a module in C that supports our own implementation of some C string functions. You will also need to write an appropriate header file and modify the Makefile to compile your module. We also provide code that includes and tests your module.
It may also help to look at the lecture examples for creating a makefile and header file. (Makefiles to be covered in Lecture on 09/17 and Recitation on 09/18)
Task 1.
Your first task after getting set up and creating an empty cstring.c and cstring.h is to READ THE ENTIRE SPEC and then get your code set up to compile.
Note the “Suggested Implementation Order” section towards the bottom of the spec.
Functions to Implement
cstrlen
Function Declaration:
unsigned int cstrlen(char *str);
Function Description:
Returns the length of the specified string str
, not counting the null-terminator.
E.g. cstrlen("hi") == 2U
cstrcpy
Function Declaration:
char* cstrcpy(char *dest, char *src);
Function Description:
Copies the string in src
into dest
.
Assumes that dest
has enough space to store all the characters copied over.
Returns dest
.
cstrdup
Function Declaration:
char* cstrdup(char *str);
Function Description:
Returns a pointer to a new string which is a duplicate of the string str
. Memory for the new string is obtained with malloc(3), and can be freed with free(3).
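The ownership rule described above (the function allocates, the caller frees) can be sketched on a different data type. dup_ints is a hypothetical helper that is not part of the assignment. It only illustrates the allocate-then-copy pattern and the NULL check after malloc.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of the assignment): returns a
 * heap-allocated copy of the first n ints in src, or NULL if the
 * allocation fails. The caller owns the result and must free it. */
int *dup_ints(const int *src, size_t n) {
    int *copy = malloc(n * sizeof *copy);
    if (copy == NULL) {
        return NULL;  /* allocation failed; nothing to free */
    }
    for (size_t i = 0; i < n; i++) {
        copy[i] = src[i];
    }
    return copy;
}
```

A caller would keep the returned pointer, use the copy, and then pass that same pointer to free exactly once.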
cstrchr
Function Declaration:
char* cstrchr(char *str, char target);
Function Description:
Returns a pointer to the first occurrence of the character target
in the string str
.
Return a pointer to the matched character or NULL if the character is not found.
The terminating null byte is considered part of the string, so if target is specified as '\0', this function returns a pointer to the terminator.
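To make the expected behavior concrete, the sketch below calls the standard library strchr, which the description above mirrors. It is for illustration only, since your submission may not use string.h. first_index is a hypothetical helper that turns the returned pointer into an offset.

```c
#include <stddef.h>
#include <string.h>  /* illustration only; not allowed in your submission */

/* Hypothetical helper: offset of the first occurrence of target in str,
 * or -1 if target does not occur. */
long first_index(const char *str, char target) {
    const char *hit = strchr(str, target);
    return (hit == NULL) ? -1 : (long)(hit - str);
}
```

With the string "hello", searching for 'l' gives offset 2, searching for 'z' gives -1, and searching for '\0' gives offset 5, which is the position of the terminator.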
cstrstr
Function Declaration:
char* cstrstr(char *str, char *target);
Function Description:
Finds the first occurrence of the substring target
in the string str
. The terminating null bytes ('\0') are not compared.
Return a pointer to the beginning of the located substring, or NULL if the substring is not found.
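As a reference for the expected results, the sketch below uses the standard library strstr, again for illustration only because string.h is not allowed in your submission. substring_index is a hypothetical helper that reports the match position as an offset.

```c
#include <stddef.h>
#include <string.h>  /* illustration only; not allowed in your submission */

/* Hypothetical helper: offset of the first occurrence of target inside
 * str, or -1 if target is not a substring of str. */
long substring_index(const char *str, const char *target) {
    const char *hit = strstr(str, target);
    return (hit == NULL) ? -1 : (long)(hit - str);
}
```

For example, "o w" is found at offset 4 of "hello world", while "xyz" is not found at all.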
cstrpbrk
Function Declaration:
char* cstrpbrk(char *str, char *break_set);
Function Description:
Locates the first occurrence in the string str
of any of the bytes in the string break_set
.
Returns a pointer to the byte in str
that matches one of the bytes in break_set
, or NULL if no such byte is found.
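The behavior described above matches the standard library strpbrk, which the sketch below calls for illustration only (string.h is not allowed in your submission). first_break_index is a hypothetical helper that reports the matching position as an offset.

```c
#include <stddef.h>
#include <string.h>  /* illustration only; not allowed in your submission */

/* Hypothetical helper: offset of the first byte in str that also appears
 * in break_set, or -1 if no byte of str is in break_set. */
long first_break_index(const char *str, const char *break_set) {
    const char *hit = strpbrk(str, break_set);
    return (hit == NULL) ? -1 : (long)(hit - str);
}
```

For the string "aaa;;bbb," and the break set ";,", the first match is the ';' at offset 3.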
cstrcmp
Function Declaration:
int cstrcmp(char *lhs, char *rhs);
Function Description:
Compares the two strings lhs
and rhs
.
Returns an integer indicating the result of the comparison, as follows:
- 0, if the lhs and rhs are equal;
- a negative value if lhs is less than rhs;
- a positive value if lhs is greater than rhs.
You can do this by comparing the ASCII values of the characters.
Example:
- cstrcmp("a", "b") returns a negative value
- cstrcmp("ba", "b") returns a positive value
- cstrcmp("a", "a") returns 0
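Because only the sign of the result is specified, code that checks a comparison should test for negative, zero, or positive rather than for an exact number. The sketch below shows this with the standard library strcmp, for illustration only since string.h is not allowed in your submission. compare_sign is a hypothetical helper.

```c
#include <string.h>  /* illustration only; not allowed in your submission */

/* Hypothetical helper: collapses a comparison result to -1, 0, or 1 so
 * that callers never depend on the exact magnitude returned. */
int compare_sign(const char *lhs, const char *rhs) {
    int result = strcmp(lhs, rhs);
    return (result > 0) - (result < 0);
}
```

In the second example above, "ba" is greater than "b" because the first characters match and 'a' is then compared against the terminator of the shorter string.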
cmemset
Function Declaration:
void *cmemset(void *s, unsigned char c, unsigned int n);
Function Description:
Fills the first n bytes of the memory area pointed to by s
with the constant byte c
.
Returns a pointer to the memory area s
.
cmemcpy
Function Declaration:
void *cmemcpy(void *dest, void *src, unsigned int n);
Function Description:
Copies n
bytes from memory area src
to memory area dest
. The memory areas must not overlap.
Returns a pointer to dest
.
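The two memory functions above operate on raw bytes, so they do not add or look for a null terminator. The sketch below shows this with the standard memset and memcpy, for illustration only since string.h is not allowed in your submission. fill_and_copy is a hypothetical helper.

```c
#include <string.h>  /* illustration only; not allowed in your submission */

/* Hypothetical helper: fills a 4-byte scratch area with 'x', copies those
 * 4 bytes into out, then terminates out by hand. out must have room for
 * at least 5 bytes. */
void fill_and_copy(char *out) {
    char scratch[4];
    memset(scratch, 'x', sizeof scratch);  /* sets exactly 4 bytes */
    memcpy(out, scratch, sizeof scratch);  /* copies exactly 4 bytes */
    out[4] = '\0';                         /* neither call adds a terminator */
}
```

Note that scratch and out are separate buffers, which satisfies the requirement that the copied areas do not overlap.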
cstrtok_r
Function Declaration:
char* cstrtok_r(char* input, char* delims, char** save_ptr);
Function Description:
Parses a string, retrieving any non-empty tokens. On the first call to cstrtok_r, the string to be parsed should be specified in input. In each subsequent call that should parse the same string, input must be NULL.
The delims argument specifies the set of bytes that delimit the tokens in the parsed string (i.e. the characters that we want to split on). delims is allowed to change between calls to cstrtok_r.
Each call to cstrtok_r() returns a pointer to a null-terminated string containing the next non-empty token. Each token returned is a copy that has been allocated by malloc() and should later be deallocated with free().
If no more tokens are found, NULL
is returned.
The save_ptr argument is a pointer to a char * variable that is used internally by cstrtok_r() in order to maintain context between successive calls that parse the same string.
save_ptr (and the buffer that it points to) should be unchanged by the caller between each call. The actual value of save_ptr does not matter to the caller; it is just a way for the function to remember “where it left off” the last time cstrtok_r() was called. Usually the function will set *save_ptr to a pointer to the first byte to check on the next call.
No global (or static
) variables are allowed or needed to get this function to work.
Consider the following example usage:
char* save_ptr; // caller does not care what value this is.
char* source = "aaa;;bbb,";
char* first = cstrtok_r(source, ";,", &save_ptr); // returns "aaa"
char* second = cstrtok_r(NULL, ";,", &save_ptr); // returns "bbb"
char* third = cstrtok_r(NULL, ";,", &save_ptr); // returns NULL
printf("%s %s %s\n", source, first, second); // prints "aaa;;bbb, aaa bbb"
free(first);
free(second);
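The save pointer pattern used above can be seen in isolation in the sketch below. next_digit is a hypothetical function that is not part of the assignment and does no tokenizing or allocation. It only shows how a caller-supplied pointer lets a function resume where it stopped without any global or static variables.

```c
#include <stddef.h>

/* Hypothetical function (not part of the assignment): returns the next
 * decimal digit in the string as an int, or -1 when none remain.
 * Pass the string on the first call and NULL on later calls. */
int next_digit(const char *input, const char **save_ptr) {
    /* Start from the new string if one was given, else resume. */
    const char *cursor = (input != NULL) ? input : *save_ptr;

    /* Skip everything that is not a digit. */
    while (*cursor != '\0' && (*cursor < '0' || *cursor > '9')) {
        cursor++;
    }
    if (*cursor == '\0') {
        *save_ptr = cursor;  /* stay parked on the terminator */
        return -1;
    }
    *save_ptr = cursor + 1;  /* first byte to check on the next call */
    return *cursor - '0';
}
```

Calling it first with "a1b22" and then repeatedly with NULL yields 1, 2, 2, and then -1 on every further call, because the saved position stays on the terminator.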
Header File
Part of this assignment is creating a header file for the cstring
module.
Header files expose the “public” aspect of our module by allowing other files to see which functions the module declares without including the .c code directly.
Notably: Header files should almost always only contain declarations or pre-processor macros. If you are putting a global variable or a function definition (implementation) in a header file, you are doing something wrong.
Header files also help because C needs to see a function declared or defined before it can be called. If we #include our own header file, we can be sure that all the function declarations are visible and we do not need to worry about the order of their definitions in the .c file.
For our header file, you should start by creating the appropriate header guards. Afterwards, put a function declaration in the header file for each function we want to implement for that module. See the Functions to Implement section for the string functions we want to implement in this assignment.
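A minimal sketch of the header guard layout follows, using a hypothetical shapes module rather than cstring. The guard name SHAPES_H_ and both function names are assumptions made for illustration. The header and the source file are shown in one block so the sketch compiles as a single unit.

```c
/* ---- shapes.h (hypothetical header) ---- */
#ifndef SHAPES_H_
#define SHAPES_H_

/* Declarations only: no function bodies and no global variables. */
unsigned int rect_area(unsigned int width, unsigned int height);
unsigned int square_area(unsigned int side);

#endif /* SHAPES_H_ */

/* ---- shapes.c (hypothetical source file) ----
 * In a real project this file would start by including shapes.h
 * instead of sharing a file with it. */

/* square_area can call rect_area here even though rect_area is defined
 * further down, because its declaration was already seen above. */
unsigned int square_area(unsigned int side) {
    return rect_area(side, side);
}

unsigned int rect_area(unsigned int width, unsigned int height) {
    return width * height;
}
```

If the header is included twice in one file, the second copy is skipped because SHAPES_H_ is already defined.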
Makefile
You must also finish the Makefile we provide with this assignment. This is not as hard as it may sound: it should have been covered in one of the lectures at this point, and we give you most of it to start. Your completed Makefile should:
- Compile your cstring.c into a cstring.o whenever cstring.c or cstring.h is modified
- Make the other necessary targets recompile when cstring.h is modified
- Create the use_cstring target to compile the use_cstring executable
- Finish the use_cstring.o target to compile the use_cstring.o object file
- Use the same compiler we have been using in class
- Compile with extra information for a debugger
- Have the “enable all warnings” option turned on
You will need to complete this Makefile to test your code, and you will need to submit the Makefile to Gradescope.
Allowed Functions
For this assignment you are allowed to write any helper functions you need, but the only library facilities you may use are malloc from stdlib.h and the bool type from stdbool.h. Everything else should be code that you write yourself.
Notably: using the string.h
library will automatically fail the autograder. The whole point is that we are writing our own version of some of these commonly used functions.
If you do not see a function listed that you think should be ok to use, please ask and we can allow it or disallow it. We will most likely disallow it though.
Suggested Implementation Order:
- Download the setup files and create empty versions of cstring.c and cstring.h.
- Read through the entire spec.
- Populate cstring.h with function declarations and appropriate header guards.
- Write “empty” implementations for each function and put them inside cstring.c. By empty we mean that they do nothing but return NULL or 0.
- Edit the Makefile so that it has a rule to make cstring.o and so that the other rules that need cstring.o (test_suite and use_cstring) build correctly. Also add a clean rule. Once it compiles, you should be able to run the test_suite and fail most tests.
- Go back and fill in the implementations for the cstring.c functions. You should be able to test some of them as you go. Remember to use valgrind to check for memory errors. gdb is also your friend for debugging!
- Submit the assignment :)
Testing
You can compile your implementation by using the make command once you finish the Makefile. This will result in several output files, including an executable called test_suite.
After compiling your solution with make, you can run all of the tests for the homework by invoking:
./test_suite
You can also run only specific tests by passing command line arguments to test_suite.
For example, to only run the cstrlen tests, you can type in:
./test_suite [cstrlen]
Note: you may have to type in ./test_suite \[cstrlen\]
for it to work.
You can specify which tests are run for any of the tests in the assignment. You just need to know the names of the tests, and you can do this by running:
./test_suite --list-tests
These settings can be helpful for debugging specific parts of the assignment, especially since test_suite
can be run with these settings through valgrind
and gdb
!
Valgrind
We will also test your submission for memory errors and memory leaks using valgrind. To check this yourself, run:
valgrind --leak-check=full ./test_suite
If everything is correct, you should see the following towards the bottom of the output:
==1620== All heap blocks were freed -- no leaks are possible
==1620==
==1620== For counts of detected and suppressed errors, rerun with: -v
==1620== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If you do not see something similar to the above in your output, valgrind will have printed out details about where the errors and memory leaks occurred.
Submission
Please submit your completed cstring.c, cstring.h, and Makefile to Gradescope.