HW 1 - Haskell List Processing and recursion
This is the first homework assignment for CIS 552. It provides practice with the basic built-in data structures of Haskell, including lists, tuples and maybes, as well as recursion and pattern matching. It also covers the basics of Haskell code style and test-driven development.
This file is a "literate" Haskell program, meaning that explanation is interspersed with actual Haskell code. To complete your assignment, download the zipfile, edit Main.hs
(this file contains just the code) and submit it to the link on the homework page.
Before we go any further, you should make sure that you can compile and run this code. Under emacs: C-c C-l
will load it interactively into ghci. You can then type main
to run the main routine. Alternatively, to compile the code from the command line type
ghc --make Main
which will produce an executable (named Main
, a.out
or Main.exe
depending on your system).
The first line of actual code for this homework assignment is a pragma for GHC (these are specific instructions for the compiler). GHC doesn't warn about some behaviors automatically, but they are required by the style guide so you want to know about them.
This next compiler option
turns type errors into warnings so that you can still run your code in ghci even if it doesn't type check. This flag doesn't defer all compile-time errors (such as errors for unbound variables) and you definitely should fix any type errors that it warns you about. However, having it in place helps make development more interactive. Of course, if you try to run the part of your code with a type error in it, it will fail.
When you are finished with the assignment, you should add the option -Werror
, which turns all warnings into errors, to the line above. We will not grade your assignment if there are any compiler warnings.
Next, we declare that we are creating a module called Main
and are using functions defined in the modules Prelude
, Test.HUnit
, Data.List
and Data.Char
.
The Prelude
line imports all except for the functions listed (which you will write). The module Prelude
is special in that it is always imported by default, so the the point of this line is not to import more functions, but rather to exclude a few functions. (Haskell does not allow functions to be redefined in the same module.)
The Test.HUnit
line imports all functions defined in that module. The line Data.List
imports all functions from that module, but makes them available with qualified names, such as List.intersperse
, etc.
> module Main where
> import Prelude hiding (reverse, concat, zip, (++))
> import Test.HUnit
> import qualified Data.List as List
> import qualified Data.Char as Char
> import qualified Data.Maybe as Maybe
> import qualified Text.Read as Read
The main "program" for this assignment runs the tests for each homework problem below. You should not edit this definition. Instead, your goal is to modify the problems below so that all of the tests pass. Note that the definitions in Haskell modules do not need to come in any particular order; here, the main function uses the definitions testStyle
, testLists
, etc, even though their definitions come much later in the file.
> main :: IO ()
> main = do
> _ <- runTestTT $ TestList [ testStyle,
> testLists,
> testWeather,
> testSoccer ]
> return ()
Now that we have the preliminaries out of the way, we can start the actual problems.
Problem (Good Style)
All of the following Haskell code does what it is supposed to do (i.e. the tests pass), but it is difficult to read. Rewrite the following expressions so that they exactly follow the style guide. Be careful: the style guide includes quite a few rules, and we've broken most of them in what follows! (You don't need to rewrite the test following each part, but you do need to make sure that you don't break the code as you refactor it!)
NOTE: do not change the name of any of the declarations below, even if you think that they aren't very good (they aren't). We will be using automatic testing to ensure that you do not break anything when you rewrite these functions.
> tabc :: Test
> tabc = "abc" ~: TestList [abc True False True ~?= True,
> abc True False False ~?= False,
> abc False True True ~?= False]
> arithmetic :: ((Int, Int), Int) -> ((Int,Int), Int) -> (Int, Int, Int)
> arithmetic x1 x2 =
> let a = fst (fst x1) in
> let b = snd (fst x1) in
> let c = snd x1 in
> let d = fst (fst x2) in
> let e = snd (fst x2) in
> let f = snd x2
> in
> ((((((b*f) - (c*e)), ((c*
> d) - (a*f)
> ), ((a*e)-(b*d))))))
> tarithmetic :: Test
> tarithmetic = "arithmetic" ~:
> TestList[ arithmetic ((1,2),3) ((4,5),6) ~?= (-3,6,-3),
> arithmetic ((3,2),1) ((4,5),6) ~?= (7,-14,7) ]
> reverse l = reverse_aux l [] where
> reverse_aux l acc =
> if null l then acc
> else reverse_aux (tail l) (head l : acc)
> treverse :: Test
> treverse = "reverse" ~: TestList [reverse [3,2,1] ~?= [1,2,3],
> reverse [1] ~?= [1] ]
> zip xs ys = g 0 xs ys where
> g n xs ys = if n == length xs || n == length ys then [] else
> (xs !! n, ys !! n) : g (n + 1) xs ys
> tzip :: Test
> tzip = "zip" ~:
> TestList [ zip "abc" [True,False,True] ~?= [('a',True),('b',False), ('c', True)],
> zip "abc" [True] ~?= [('a', True)],
> zip [] [] ~?= ([] :: [(Int,Int)]) ]
Problem (List library chops)
Define, debug and test the following functions. (Some of these functions are part of the Haskell standard prelude or standard libraries like Data.List
. Their solutions are readily available online. You should not google for this code: instead, implement them yourself.)
For each part of the assignment, you need to declare the type of the function, define it, and replace the testcase for that part based on the problem description. The declared type of each function should be the most general one. Make sure to test each of these functions with multiple inputs using TestList
. We will be grading your test cases as well as the correctness and style of your solutions!
> testLists :: Test
> testLists = "testLists" ~: TestList
> [tstartsWith, tendsWith, ttranspose, tconcat, tcountSub]
> -- | The 'startsWith' function takes two strings and returns 'True'
> -- iff the first is a prefix of the second.
> --
> -- >>> "Hello" `startsWith` "Hello World!"
> -- True
> --
> -- >>> "Hello" `startsWith` "Wello Horld!"
> -- False
>
> startsWith = undefined
> tstartsWith :: Test
> tstartsWith = "startsWith" ~: (assertFailure "testcase for startsWith" :: Assertion)
> -- | The 'endsWith' function takes two lists and returns 'True' iff
> -- the first list is a suffix of the second. The second list must be
> -- finite.
> --
> -- >>> "ld!" `endsWith` "Hello World!"
> -- True
> --
> -- >>> "World" `endsWith` "Hello World!"
> -- False
> tendsWith :: Test
> tendsWith = "endsWith" ~: (assertFailure "testcase for endsWith" :: Assertion)
> -- | The concatenation of all of the elements of a list of lists
> --
> -- >>> concat [[1,2,3],[4,5,6],[7,8,9]]
> -- [1,2,3,4,5,6,7,8,9]
> --
> -- NOTE: do not use any functions from the Prelude or Data.List for
> -- this problem, even for use as a helper function.
> -- | The 'transpose' function transposes the rows and columns of its argument.
> -- If the inner lists are not all the same length, then the extra elements
> -- are ignored. Note, this is *not* the same behavior as the library version
> -- of transpose.
> --
> -- >>> transpose [[1,2,3],[4,5,6]]
> -- [[1,4],[2,5],[3,6]]
> -- >>> transpose [[1,2],[3,4,5]]
> -- [[1,3],[2,4]]
> -- >>> transpose ([[]] :: [[Integer]])
> -- []
>
> -- transpose is defined in Data.List
> -- (WARNING: this one is tricky!)
>
> transpose = undefined
> ttranspose :: Test
> ttranspose = "transpose" ~: (assertFailure "testcase for transpose" :: Assertion)
> -- | The 'countSub' function returns the number of (potentially overlapping)
> -- occurrences of a substring sub found in a string.
> --
> -- >>> countSub "aa" "aaa"
> -- 2
> countSub = undefined
> tcountSub :: Test
> tcountSub = "countSub" ~: (assertFailure "testcase for countSub" :: Assertion)
Data Munging Kata
A Code Kata is an exercise that helps an experienced programmer hone their skills. The coding exercise is usually not difficult---what is important is the analysis and design of the problem as well and the practice and repetition that lead to good coding habits. This exercise comes from website devoted to Code Katas and is not specific to Haskell.
Unlike the exercises above, for this problem you are allowed to use functions from Haskell's standard libraries. In particular, you may use list functions from the Prelude, or from Data.List in your solution. You may also use functions from Data.Char.
This problem is an exercise in three parts to do with real world data. For that reason, we aren't expecting you to produce a robust solution. Your code must work only for the input test files that we provide. We won't test it on any other inputs! Consequently, you can ignore the style guideline that says that your functions must not be partial...
This problem also is about refactoring, so try hard not to read ahead---do each of the three parts below in turn and then reflect on your experience.
In jul19.dat
(in the zipfile) you'll find daily weather data for Philadelphia, PA NJ for July 2019. This data is taken from NOAA.
Download this text file, then write a program to output the day number (column one) with the smallest temperature spread (the maximum temperature is the second column, the minimum the third column).
We've given you the I/O parts of the program---opening the file and then printing the final result. Your job is to write the weather
function below, that takes the string containing the text of the file and processes it to find the answer. Your program should work for any text file with the same format as this one. If the format is different, the behavior is unspecified (your program can throw an error, return a garbage string, etc.). We will discuss better approaches to error handling later in the semester.
> weatherProgram :: IO ()
> weatherProgram = do
> str <- readFile "jul19.dat"
> putStrLn (weather str)
The (overloaded) Read.readMaybe
function can help you convert strings into integers. We've given it a new name and type signature to make it easier to use.
Here is the test case for this part. If this test fails because it cannot find the input file, you need to use the :cd
command in ghci to make sure that you are in the right directory.
> testWeather :: Test
> testWeather = "weather" ~: do str <- readFile "jul19.dat"
> weather str @?= "8"
The file soccer.dat
contains the results from the English Premier League for 2001/2. The columns labeled "F" and "A" contain the total number of goals scored for and against each team in that season (so Arsenal scored 79 goals against opponents, and had 36 goals scored against them). Write a program to print the name of the team with the smallest (absolute) difference in "for" and "against" goals.
Your program needs to only work with the soccer.dat file. If given a different file it may or may not throw an error.
> soccerProgram :: IO ()
> soccerProgram = do
> str <- readFile "soccer.dat"
> putStrLn (soccer str)
> testSoccer :: Test
> testSoccer = "soccer" ~: do
> str <- readFile "soccer.dat"
> soccer str @?= "Aston_Villa"
Now, take the two programs written previously and factor out as much common code as possible, leaving you with two smaller programs and some kind of shared functionality.
Fill in the strings below with your answers.
> -- To what extent did the design decisions you made when writing the original
> -- programs make it easier or harder to factor out common code?
> -- Is factoring out as much common code as possible always a good thing? Did the
> -- readability of the programs suffer because of this requirement? How about the
> -- maintainability?