HW 1 - List Processing, Recursion and working with Higher-order Functions
This is the first homework assignment for CIS 5520. The first two problems provide practice with the basic built-in data structures of Haskell, including lists, tuples and maybes, as well as recursion and pattern matching. The second two problems provide practice with higher-order functions in Haskell. The last problem is a design exercise that puts everything together.
If you have not read the Basics module and completed the associated quiz, you should do that first.
The homework assignment also draws from the HigherOrder module.
Create your own private repo for the assignment by following these instructions.
This page is a "literate" Haskell program, meaning that explanation is interspersed with actual Haskell code. To complete your assignment, edit HW01.hs in the hw01 repository that you created and submit it through Gradescope.
When you work with this file, you can use a terminal to load the project into ghci with the command stack ghci. That will give you interactive access to all definitions in this module, as well as a function, called main, that you can use to run all of the test cases.
Alternatively, you can run the test cases with the command stack run.
(For each of these commands, make sure that you are in the hw01 subdirectory in the terminal.)
> module HW01 where
> import Prelude hiding (reverse, concat, zip, (++), takeWhile, all)
> import Test.HUnit
> ( (@?=), (~:), (~=?), (~?=), runTestTT, Assertion, Test(TestList), assertFailure )
> -- libraries for Kata problem (only)
> import Text.Read (readMaybe)
> import qualified Data.List as List
> import qualified Data.Char as Char
> import qualified Data.Maybe as Maybe
This file starts by declaring that we are creating a module called HW01 and are using functions defined in the modules Prelude, Test.HUnit, Text.Read, Data.List, Data.Char and Data.Maybe.
The Prelude line imports everything except the functions listed (which you will write). The module Prelude is special in that it is always imported by default, so the point of this line is not to import more functions, but rather to exclude a few functions. (Haskell does not allow functions to be redefined in the same module.)
The Test.HUnit line imports the specified functions and definitions from that module. Recall that operators (i.e. binary functions with symbolic names) must be enclosed in parentheses when they are named. Similarly, the import from Text.Read only brings the readMaybe function into scope.
The last three lines import all functions from the respective modules, but make them available only under qualified names, such as List.intersperse. We import them in this way so that you have the freedom to use any functions from these modules in your solution to the Kata problem.
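As a small illustration of these three import styles outside the homework, the sketch below hides one Prelude function, imports one name unqualified, and imports one module qualified. The names lookup, table, describe and shout are made up for this example and are not part of HW01.

```haskell
import Prelude hiding (lookup)      -- everything from the Prelude except lookup
import Data.Maybe (fromMaybe)       -- only fromMaybe, usable unqualified
import qualified Data.Char as Char  -- all of Data.Char, but only as Char.<name>

-- Our own lookup. Without the 'hiding' clause above, every use of this
-- name would be ambiguous with the Prelude's version.
lookup :: Eq k => k -> [(k, v)] -> Maybe v
lookup _ [] = Nothing
lookup key ((k, v) : rest)
  | key == k = Just v
  | otherwise = lookup key rest

-- Hypothetical table used only for this illustration.
table :: [(Int, String)]
table = [(1, "one"), (2, "two")]

-- Uses the unqualified import.
describe :: Int -> String
describe n = fromMaybe "unknown" (lookup n table)

-- Uses the qualified import.
shout :: String -> String
shout = map Char.toUpper
```

Writing toUpper without the Char. prefix would not compile here, because the qualified import does not bring the bare name into scope.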
> --------------------------------------------------------------------------------
> -- Problem (Good Style)
> --------------------------------------------------------------------------------
This is the test case for this problem. Recall that in Haskell, definitions can be in any order. This test has four subparts, defined below.
> testStyle :: Test
> testStyle = "testStyle" ~:
> TestList [ tabc , tarithmetic, treverse, tzip ]
All of the following Haskell code does what it is supposed to do (i.e. the tests pass), but it is difficult to read. Rewrite the following expressions so that they exactly follow the style guide. Be careful: the style guide includes quite a few rules, and we've broken most of them in what follows! (You don't need to rewrite the test following each part, but you do need to make sure that you don't break the code as you refactor it!)
NOTE: Do not change the name of any of the top level declarations below, even if you think that they aren't very good (they aren't). We will be using automatic testing to ensure that you do not break anything when you rewrite these functions. On the other hand, local variables (such as function parameters and those bound by let and where) can and should be renamed.
NOTE: If you have set up VSCode and hlint correctly, your IDE should give you a few hints on how to improve these functions, but it won't tell you everything.
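To give a flavor of this kind of refactoring on a function that is not part of the assignment, here is a before-and-after sketch. The style guide itself is not reproduced on this page, so the two rules applied (prefer pattern matching to null, head and tail, and return a Boolean expression directly instead of wrapping it in if-then-else) are assumptions based on common hlint suggestions. Both function names are hypothetical.

```haskell
-- Before: tests with null, takes the list apart with head,
-- and wraps a Boolean in a redundant if.
firstIsZeroBefore :: [Int] -> Bool
firstIsZeroBefore l =
  if null l then False
  else if head l == 0 then True else False

-- After: pattern matching names the pieces directly, and the
-- comparison is already the Boolean we want.
firstIsZero :: [Int] -> Bool
firstIsZero [] = False
firstIsZero (x : _) = x == 0
```

The two versions agree on every input; only the readability changes, which is the property the tests in this problem are there to protect.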
> abc x y z =
> if x then if y then True else
> if (x && z) then True else False
> else False
> tabc :: Test
> tabc = "abc" ~: TestList [abc True False True ~?= True,
> abc True False False ~?= False,
> abc False True True ~?= False]
> arithmetic :: ((Int, Int), Int) -> ((Int,Int), Int) -> (Int, Int, Int)
> arithmetic x1 x2 =
> let a = fst (fst x1) in
> let b = snd (fst x1) in
> let c = snd x1 in
> let d = fst (fst x2) in
> let e = snd (fst x2) in
> let f = snd x2
> in
> ((((((b*f) - (c*e)), ((c*
> d) - (a*f)
> ), ((a*e)-(b*d))))))
> tarithmetic :: Test
> tarithmetic = "arithmetic" ~:
> TestList[ arithmetic ((1,2),3) ((4,5),6) ~?= (-3,6,-3),
> arithmetic ((3,2),1) ((4,5),6) ~?= (7,-14,7) ]
> reverse l = reverseAux l [] where
> reverseAux l acc =
> if null l then acc
> else reverseAux (tail l) (head l : acc)
> treverse :: Test
> treverse = "reverse" ~: TestList
> [reverse [3,2,1] ~?= ([1,2,3] :: [Int]),
> reverse [1] ~?= ([1] :: [Int]) ]
> zip xs ys = g 0 xs ys where
> g n xs ys = if n == length xs || n == length ys then [] else
> (xs !! n, ys !! n) : g (n + 1) xs ys
> tzip :: Test
> tzip = "zip" ~:
> TestList [ zip "abc" [True,False,True] ~?= [('a',True),('b',False), ('c', True)],
> zip "abc" [True] ~?= [('a', True)],
> zip [] [] ~?= ([] :: [(Int,Int)]) ]
> --------------------------------------------------------------------------------
> -- Problem (List recursion)
> --------------------------------------------------------------------------------
Now define, debug and test the following functions that work with lists. Some of these functions are part of the Haskell standard Prelude or standard libraries like Data.List. Their solutions are readily available online. You should not google for this code: instead, implement them yourself!
For each part of this problem, you should replace the testcase for that part based on the description in the comments. Make sure to test with multiple inputs using TestList. We will be grading your test cases as well as the correctness and style of your solutions! HINT: your testing code should include any tests that we give you in the comments!
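A test with multiple inputs might be shaped like the following sketch. It assumes the HUnit package is available, as it is in the homework's stack project; double, tdouble and allPassed are hypothetical names that do not appear in HW01.

```haskell
import Test.HUnit (Counts (..), Test (TestList), runTestTT, (~:), (~?=))

-- Hypothetical function under test (not part of HW01).
double :: Int -> Int
double n = 2 * n

-- One labeled test made of several cases; individual cases can
-- carry their own labels too.
tdouble :: Test
tdouble =
  "double" ~:
    TestList
      [ double 0 ~?= 0,
        double 3 ~?= 6,
        "negative" ~: double (-2) ~?= (-4)
      ]

-- True when no case in a finished run failed or raised an error.
allPassed :: Counts -> Bool
allPassed c = errors c == 0 && failures c == 0
```

In ghci, runTestTT tdouble runs the cases and prints a Counts summary. In each case the actual value goes on the left of (~?=) and the expected value on the right.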
Do not use any library functions in this problem. These include all functions from the Prelude or from Data.List that take arguments or return a result with a list type. However, (:) and [] are data constructors for the list type, not functions, so you are free to use them. Please also avoid list comprehension syntax, as it actually de-sugars into list library functions!
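To see why list comprehensions count as library functions, compare the two definitions below. The second follows the shape of the translation given in the Haskell 2010 Report, which is phrased in terms of concatMap; GHC's actual desugaring may differ in detail. Both function names are hypothetical.

```haskell
-- A list comprehension with one generator and one guard.
squaresOfEvens :: [Int] -> [Int]
squaresOfEvens xs = [x * x | x <- xs, even x]

-- An equivalent definition without comprehension syntax: the generator
-- becomes a concatMap and the guard becomes an if that yields either a
-- one-element list or the empty list.
squaresOfEvens' :: [Int] -> [Int]
squaresOfEvens' = concatMap (\x -> if even x then [x * x] else [])
```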
> testLists :: Test
> testLists = "testLists" ~: TestList
> [tminimumMaybe, tstartsWith, tendsWith, ttranspose, tcountSub]
> -- | The 'minimumMaybe' function computes the minimum value of a
> -- nonempty list. If the list is empty, it returns Nothing.
> --
> -- >>> minimumMaybe []
> -- Nothing
> -- >>> minimumMaybe [2,1,3]
> -- Just 1
> minimumMaybe :: [Int] -> Maybe Int
> minimumMaybe = undefined
> tminimumMaybe :: Test
> tminimumMaybe =
> "minimumMaybe" ~: (assertFailure "testcases for minimumMaybe" :: Assertion)
> -- | The 'startsWith' function takes two strings and returns 'True'
> -- iff the first is a prefix of the second.
> --
> -- >>> "Hello" `startsWith` "Hello World!"
> -- True
> --
> -- >>> "Hello" `startsWith` "Wello Horld!"
> -- False
> startsWith :: String -> String -> Bool
> startsWith = undefined
> tstartsWith :: Test
> tstartsWith = "startsWith" ~: (assertFailure "testcase for startsWith" :: Assertion)
> -- | The 'endsWith' function takes two lists and returns 'True' iff
> -- the first list is a suffix of the second. The second list must be
> -- finite.
> --
> -- >>> "ld!" `endsWith` "Hello World!"
> -- True
> --
> -- >>> "World" `endsWith` "Hello World!"
> -- False
> endsWith :: String -> String -> Bool
> endsWith = undefined
> tendsWith :: Test
> tendsWith = "endsWith" ~: (assertFailure "testcase for endsWith" :: Assertion)
> -- | The 'transpose' function transposes the rows and columns of its argument.
> -- If the inner lists are not all the same length, then the extra elements
> -- are ignored.
> -- You may assume that the input list is non-empty, and that each of the sublists
> -- is also non-empty.
> -- (i.e. we won't test your code on `transpose []` or `transpose [[]]`)
> -- Note, this function should *not* have the same behavior as the library version
> -- of transpose (i.e. the version of transpose from Data.List), which retains
> -- extra elements in the output.
> -- >>> transpose [[1,2,3],[4,5,6]]
> -- [[1,4],[2,5],[3,6]]
> -- >>> transpose [[3,4,5]]
> -- [[3],[4],[5]]
> -- >>> transpose [[1,2],[3,4,5]]
> -- [[1,3],[2,4]]
> -- (WARNING: this one is tricky!)
> transpose :: [[a]] -> [[a]]
> transpose = undefined
> ttranspose :: Test
> ttranspose = "transpose" ~: (assertFailure "testcase for transpose" :: Assertion)
> -- | The 'countSub' function returns the number of (potentially overlapping)
> -- occurrences of a substring sub found in a string.
> --
> -- >>> countSub "aa" "aaa"
> -- 2
> -- >>> countSub "" "aaac"
> -- 5
Hint: You can use other functions that you have defined in this file.
> countSub :: String -> String -> Int
> countSub = undefined
> tcountSub :: Test
> tcountSub = "countSub" ~: (assertFailure "testcase for countSub" :: Assertion)
> --------------------------------------------------------------------------------
> -- Problem (Defining higher-order functions)
> --------------------------------------------------------------------------------
Define, debug and test the following operations that take higher-order functions as arguments. (For extra practice, you may define these operations using foldr, but that is not required.) Other than foldr, you may not use any list library functions for this problem.
> testHO :: Test
> testHO = TestList [ttakeWhile, tfind, tall, tmap2, tmapMaybe]
> -- | `takeWhile`, applied to a predicate `p` and a list `xs`,
> -- returns the longest prefix (possibly empty) of `xs` of elements
> -- that satisfy `p`.
> --
> -- >>> takeWhile (< 3) [1,2,3,4,1,2,3,4]
> -- [1,2]
> -- >>> takeWhile (< 9) [1,2,3]
> -- [1,2,3]
> -- >>> takeWhile (< 0) [1,2,3]
> -- []
> takeWhile :: (a -> Bool) -> [a] -> [a]
> takeWhile = undefined
> ttakeWhile :: Test
> ttakeWhile = "takeWhile" ~: (assertFailure "testcase for takeWhile" :: Assertion)
> -- | `find pred lst` returns the first element of the list that
> -- satisfies the predicate. Because no element may do so, the
> -- answer is returned in a `Maybe`.
> --
> -- >>> find odd [0,2,3,4]
> -- Just 3
> find :: (a -> Bool) -> [a] -> Maybe a
> find = undefined
> tfind :: Test
> tfind = "find" ~: (assertFailure "testcase for find" :: Assertion)
> -- | `all pred lst` returns `False` if any element of `lst`
> -- fails to satisfy `pred` and `True` otherwise.
> --
> -- >>> all odd [1,2,3]
> -- False
> all :: (a -> Bool) -> [a] -> Bool
> all = undefined
> tall :: Test
> tall = "all" ~: (assertFailure "testcase for all" :: Assertion)
> -- | `map2 f xs ys` returns the list obtained by applying `f`
> -- to each pair of corresponding elements of `xs` and `ys`. If
> -- one list is longer than the other, then the extra elements
> -- are ignored.
> -- i.e.
> -- map2 f [x1, x2, ..., xn] [y1, y2, ..., yn, yn+1]
> -- returns [f x1 y1, f x2 y2, ..., f xn yn]
> --
> -- >>> map2 (+) [1,2] [3,4]
> -- [4,6]
> --
> -- NOTE: `map2` is called `zipWith` in the Prelude
> map2 :: (a -> b -> c) -> [a] -> [b] -> [c]
> map2 = undefined
> tmap2 :: Test
> tmap2 = "map2" ~: (assertFailure "testcase for map2" :: Assertion)
> -- | Apply a partial function to all the elements of the list,
> -- keeping only valid outputs.
> --
> -- >>> mapMaybe root [0.0, -1.0, 4.0]
> -- [0.0,2.0]
> --
> -- (where `root` is defined below.)
> mapMaybe :: (a -> Maybe b) -> [a] -> [b]
> mapMaybe = undefined
> tmapMaybe :: Test
> tmapMaybe = "mapMaybe" ~: (assertFailure "testcase for mapMaybe" :: Assertion)
> root :: Double -> Maybe Double
> root d = if d < 0.0 then Nothing else Just $ sqrt d
> --------------------------------------------------------------------------------
> -- Problem (map and foldr practice for lists)
> --------------------------------------------------------------------------------
For the next group of functions, you are not allowed to use explicit recursion in your solutions. Instead, you must define them using one of the higher-order functions map, foldr or para (see below). These are the only list library functions that you may use on this problem. If you need any additional helper functions you may define them, but any helper functions should also use map, foldr or para instead of explicit recursion.
> testFoldr :: Test
> testFoldr = TestList [ tconcat, tstartsWithHO, tendsWithHO, ttails, tcountSubHO]
> -- | The concatenation of all of the elements of a list of lists
> --
> -- >>> concat [[1,2,3],[4,5,6],[7,8,9]]
> -- [1,2,3,4,5,6,7,8,9]
> --
NOTE: remember you cannot use any list functions from the Prelude or Data.List for this problem, even for use as a helper function. Instead, define it yourself.
> concat :: [[a]] -> [a]
> concat = undefined
> tconcat :: Test
> tconcat = "concat" ~: (assertFailure "testcase for concat" :: Assertion)
> -- | The 'startsWithHO' function takes two strings and returns 'True'
> -- iff the first is a prefix of the second. This is the same as `startsWith` above
> -- except this time you need to use `foldr` to define it.
> --
> -- >>> "Hello" `startsWithHO` "Hello World!"
> -- True
> --
> -- >>> "Hello" `startsWithHO` "Wello Horld!"
> -- False
NOTE: use foldr for this one, but it is tricky! (Hint: the value returned by foldr can itself be a function.)
> startsWithHO :: String -> String -> Bool
> startsWithHO = undefined
> tstartsWithHO :: Test
> tstartsWithHO = "startsWithHO" ~: (assertFailure "testcase for startsWithHO" :: Assertion)
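The hint that foldr can return a function is easier to absorb on a different problem. The sketch below, with the hypothetical name atIndex, folds a list into a function from an index to a Maybe result; it is an illustration of the idea only and not a solution to startsWithHO.

```haskell
-- Safe indexing written with foldr. The fold does not build a value
-- directly; it builds a function of type Int -> Maybe a.
atIndex :: [a] -> Int -> Maybe a
atIndex = foldr step (const Nothing)
  where
    -- 'rest' is the function already built for the tail of the list.
    step x rest n
      | n == 0 = Just x
      | otherwise = rest (n - 1)
```

Here the accumulator type b in foldr :: (a -> b -> b) -> b -> [a] -> b is instantiated to Int -> Maybe a, so step receives a third argument: the index that the resulting function is eventually applied to. A negative index never reaches zero, so on a finite list it falls through to the base case and yields Nothing.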
> -- INTERLUDE: para
Now consider a variant of foldr called para. In the case of cons, foldr provides access to the head of the list and the result of the fold over the tail of the list. The para function should do the same, but should also provide access to the tail of the list (before it has been processed).
> -- | foldr variant that provides access to each tail of the list
> para :: (a -> [a] -> b -> b) -> b -> [a] -> b
> para _ b [] = b
> para f b (x:xs) = f x xs (para f b xs)
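To see how the extra argument gets used, here is a sketch of inserting into a sorted list with para. The definition of para is repeated from above so that the block stands alone; insertSorted is a hypothetical name and is not one of the functions you are asked to write.

```haskell
-- para, repeated from the definition above.
para :: (a -> [a] -> b -> b) -> b -> [a] -> b
para _ b [] = b
para f b (x : xs) = f x xs (para f b xs)

-- Insert y into an already sorted list, keeping it sorted.
insertSorted :: Int -> [Int] -> [Int]
insertSorted y = para step [y]
  where
    step x xs rest
      -- y belongs here: reuse the unprocessed tail xs as is.
      | y <= x = y : x : xs
      -- Otherwise keep x and use the result of the fold over the tail.
      | otherwise = x : rest
```

With plain foldr the first branch would have no access to xs, only to the already processed rest, which by then contains y a second time.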
For example, consider the tails function.
> -- | The 'tails' function calculates all suffixes of a given list and returns them
> -- in decreasing order of length. For example:
> --
> -- >>> tails "abc"
> -- ["abc","bc","c",""]
> --
> tails :: [a] -> [[a]]
> tails [] = [[]]
> tails (x:xs) = (x:xs) : tails xs
It is a natural fit to implement tails using para. See if you can redefine the function above so that the test cases still pass.
> tails' = undefined
> ttails :: Test
> ttails = "tails" ~: TestList [
> "tails0" ~: tails' "abc" ~?= ["abc", "bc", "c", ""],
> "tails1" ~: tails' "" ~?= [""],
> "tails2" ~: tails' "a" ~?= ["a",""] ]
> -- | The 'endsWithHO' function takes two lists and returns 'True' iff
> -- the first list is a suffix of the second. The second list must be
> -- finite.
> --
> -- >>> "ld!" `endsWithHO` "Hello World!"
> -- True
> --
> -- >>> "World" `endsWithHO` "Hello World!"
> -- False
NOTE: use para for this one!
> endsWithHO :: String -> String -> Bool
> endsWithHO = undefined
> tendsWithHO :: Test
> tendsWithHO = "endsWithHO" ~: (assertFailure "testcase for endsWithHO" :: Assertion)
> -- | The 'countSubHO' function returns the number of (potentially overlapping)
> -- occurrences of a substring sub found in a string.
> --
> -- >>> countSubHO "aa" "aaa"
> -- 2
> -- >>> countSubHO "" "aaac"
> -- 5
(You may use the para and startsWithHO functions in countSubHO.)
> countSubHO :: String -> String -> Int
> countSubHO = undefined
> tcountSubHO :: Test
> tcountSubHO = "countSubHO" ~: (assertFailure "testcase for countSubHO" :: Assertion)
> --------------------------------------------------------------------------------
> -- Data Munging Kata
> --------------------------------------------------------------------------------
A Code Kata is an exercise that helps an experienced programmer hone their skills. The coding exercise is usually not difficult; what is important is the analysis and design of the problem as well as the practice and repetition that lead to good coding habits. This exercise comes from a website devoted to Code Katas and is not specific to Haskell.
Unlike the exercises above, for this problem you are allowed to use functions from Haskell's standard libraries. In particular, you may use list functions from the Prelude, or from Data.List in your solution. You may also use functions from Data.Char and Data.Maybe.
This problem is an exercise in three parts to do with real world data. For that reason, we aren't expecting you to produce a robust solution. You can expect input that is in a similar format to the data files that we provide (same number and ordering of columns, same header and footer layout). However, your code should be able to deal with reasonable changes (different number of days in a month, different number of teams in the league).
Also remember that you shouldn't use partial functions: no input should cause your program to raise an error. Definitely avoid functions such as (!!), read, or minimum.
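As a rough sketch of what avoiding partial functions can look like, the block below gives total counterparts for two of the functions named above. The names nth and parseInt are hypothetical; listToMaybe and readMaybe are real functions from Data.Maybe and Text.Read.

```haskell
import qualified Data.Maybe as Maybe
import Text.Read (readMaybe)

-- Total counterpart of (!!): Nothing when the index is out of range,
-- where xs !! n would raise an exception.
nth :: Int -> [a] -> Maybe a
nth n xs
  | n < 0 = Nothing
  | otherwise = Maybe.listToMaybe (drop n xs)

-- Total counterpart of read: Nothing when the string is not an Int,
-- where read would raise an exception.
parseInt :: String -> Maybe Int
parseInt = readMaybe
```

In both cases the failure shows up in the type as Maybe, so the caller has to decide what to do about it instead of crashing at run time.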
This problem also is about refactoring, so try hard not to read ahead---do each of the three parts below in turn and then reflect on your experience.
> -- Part One: Weather
In jul22.dat (in the dat subdirectory) you'll find daily weather data for Philadelphia, PA for July 2022. This data is taken from NOAA.
Your job is to write a program to output the day number (column one) with the smallest temperature spread (the maximum temperature is the second column, the minimum the third column). If there are multiple days with the same smallest spread, your program should return the first one.
> -- >>> weatherProgram "dat/jul23.dat"
> -- "16"
We've given you the I/O parts of the program: opening the file and then printing the final result. You need to write the weather function below, which takes the string containing the text of the file and processes it to find the answer. Your program should work for any text file with the same format as this one. If the format is different, and your program cannot parse the data, then it should return Nothing. (We will discuss better approaches to error handling later in the semester.)
> weather :: String -> Maybe String
> weather str = error "unimplemented"
> weatherProgram :: String -> IO String
> weatherProgram file = do
> str <- readFile file
> return (case weather str of
> Just result -> result
> Nothing -> "Cannot read file")
Hints: You should use the words and lines functions from the Haskell Prelude to split up the lines and columns in a manner that is robust to whitespace characters. You should also use the (overloaded) readMaybe function from Text.Read to help you convert strings into integers. We've given it a new name and type signature to make it easier for you to use.
> -- | Use this function to parse Ints
> readInt :: String -> Maybe Int
> readInt = readMaybe
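The sketch below shows how lines, words and readInt behave on messy whitespace, using made-up input and not the actual data files. readInt is repeated from above so that the block stands alone; fields and parseInts are hypothetical helper names, not a prescribed design for the Kata.

```haskell
import Text.Read (readMaybe)

-- readInt, repeated from the definition above.
readInt :: String -> Maybe Int
readInt = readMaybe

-- Split text into rows, and each row into whitespace-separated fields.
-- Runs of spaces or tabs count as a single separator.
fields :: String -> [[String]]
fields = map words . lines

-- Parse a row made only of integers; Nothing if any field is not an Int.
parseInts :: String -> Maybe [Int]
parseInts = traverse readInt . words
```

For example, fields "  1  2\n3\t4 \n" evaluates to [["1","2"],["3","4"]], while parseInts "10 x" evaluates to Nothing because one field fails to parse.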
Here is the test case for this part. If this test fails because it cannot find the input file, you need to use the :cd command in ghci to make sure that you are in the right directory.
> testWeather :: Test
> testWeather = TestList [
> "jul23" ~: do str <- readFile "dat/jul23.dat"
> weather str @?= Just "16",
> "jul22" ~: do str <- readFile "dat/jul22.dat"
> weather str @?= Just "26",
> "jul21" ~: do str <- readFile "dat/jul21.dat"
> weather str @?= Just "18",
> "jul20" ~: do str <- readFile "dat/jul20.dat"
> weather str @?= Just "10",
> "jul19" ~: do str <- readFile "dat/jul19.dat"
> weather str @?= Just "8"
> ]
> -- >>> runTestTT testWeather
> -- Counts {cases = 5, tried = 5, errors = 0, failures = 0}
> -- Part Two: Soccer League Table
The file dat/soccer22.dat contains the results from the English Premier League for the 2022/2023 season. This data is taken from SkySports. The columns labeled "W" and "L" contain the total number of wins and losses for each team in that season (so Liverpool won 19 games against opponents and lost 9). Write a program to find the name of the team with the smallest (absolute) difference between "W" and "L". If there are multiple teams with the smallest difference, your program should return the first one.
> soccer :: String -> Maybe String
> soccer = error "unimplemented"
> soccerProgram :: String -> IO String
> soccerProgram file = do
> str <- readFile file
> return $ case soccer str of
> Just result -> result
> Nothing -> "Cannot read file"
Your program should work with all similar input files (same columns, same info in footer).
> testSoccer :: Test
> testSoccer = TestList [
> "soccer22" ~: do
> str <- readFile "dat/soccer22.dat"
> soccer str @?= Just "Fulham",
> "soccer21" ~: do
> str <- readFile "dat/soccer21.dat"
> soccer str @?= Just "Leicester City",
> "soccer20" ~: do
> str <- readFile "dat/soccer20.dat"
> soccer str @?= Just "Aston Villa",
> "soccer19" ~: do
> str <- readFile "dat/soccer19.dat"
> soccer str @?= Just "Burnley",
> "soccer18" ~: do
> str <- readFile "dat/soccer18.dat"
> soccer str @?= Just "Everton"
> ]
> -- >>> runTestTT testSoccer
> -- Counts {cases = 5, tried = 5, errors = 0, failures = 0}
> -- Part Three: DRY Fusion
Now, take the two programs written previously and factor out as much common code as possible, leaving you with two smaller programs and some kind of shared functionality.
> weather2 :: String -> Maybe String
> weather2 = undefined
> soccer2 :: String -> Maybe String
> soccer2 = undefined
> -- Kata Questions
Fill in the strings below with your answers.
> -- To what extent did the design decisions you made when writing the original
> -- programs make it easier or harder to factor out common code?
> shortAnswer1 :: String
> shortAnswer1 = "Fill in your answer here"
> -- Was the way you wrote the second program influenced by writing the first?
> shortAnswer2 :: String
> shortAnswer2 = "Fill in your answer here"
> -- Is factoring out as much common code as possible always a good thing? Did the
> -- readability of the programs suffer because of this requirement? How about the
> -- maintainability?
> shortAnswer3 :: String
> shortAnswer3 = "Fill in your answer here"