A Parser for Lu
For this problem, you will implement a parser for the Lu programming language.
> module LuParser where
Make sure that you read the LuSyntax
module that describes
the syntax of Lu before continuing. If you have questions, you
can also consult the Lu manual.
This problem also uses definitions from the Parsers module from the lecture
notes, gathered together in the module Parser.hs
. Operations
such as chainl1
and filter
are imported as P.chainl1
and P.filter
.
You should also familiarize yourself with this module before continuing.
The goal of this part of the exercise is to give you practice with the
operations in the Control.Applicative
library. As a result the Parser
type is not given a monad instance, so you will not be able use do
notation with it. Furthermore, you may not edit the Parser
module, and you
do not have access to the constructor for the Parser
type, so you will not
be able to define your own monad instance either.
> import Control.Applicative
> import qualified Data.Char as Char
> import LuSyntax
> import Parser (Parser)
> import qualified Parser as P
> import Test.HUnit (runTestTT,Test(..),Assertion, (~?=), (~:), assert, Counts)
> import qualified Test.QuickCheck as QC
Tesing your Parser
Your primary method of testing your parser should be using QuickCheck, though you will also want to define your own unit tests as you go.
In particular, the following "round tripping" properties should be satisfied by your implementation. These properties state that given an arbitrary Value/Expression/Statement, if we pretty print it
> prop_roundtrip_val :: Value -> Bool
> prop_roundtrip_val v = P.parse valueP (pretty v) == Right v
> prop_roundtrip_exp :: Expression -> Bool
> prop_roundtrip_exp e = P.parse expP (pretty e) == Right e
> prop_roundtrip_stat :: Statement -> Bool
> prop_roundtrip_stat s = P.parse statementP (pretty s) == Right s
More Parser combinators
As a warm-up, let's define a few helper functions that we can use later.
In general, so that our parsers are flexible about spaces that appear in source programs, all of the parsers will need to skip over any trailing white space.
First, define a parser combinator which takes a parser, runs it,
then skips over any whitespace characters occurring afterwards. HINT: you'll
need the space
parser from the Parser library.
> wsP :: Parser a -> Parser a
> wsP p = undefined
> test_wsP :: Test
> test_wsP = TestList [
> P.parse (wsP P.alpha) "a" ~?= Right 'a',
> P.parse (many (wsP P.alpha)) "a b \n \t c" ~?= Right "abc"
> ]
> -- >>> runTestTT test_wsP
Use this to define a parser that accepts only a particular string s
and consumes any white space that follows. The last test case ensures
that trailing whitespace is being treated appropriately.
> stringP :: String -> Parser ()
> stringP = undefined
> test_stringP :: Test
> test_stringP = TestList [
> P.parse (stringP "a") "a" ~?= Right (),
> P.parse (stringP "a") "b" ~?= Left "No parses",
> P.parse (many (stringP "a")) "a a" ~?= Right [(),()]
> ]
> -- >>> runTestTT test_stringP
Define a parser that will accept a particular string s
, returning a
given value x
, and also and consume any white space that follows.
> constP :: String -> a -> Parser a
> constP _ _ = undefined
> test_constP :: Test
> test_constP = TestList [
> P.parse (constP "&" 'a') "& " ~?= Right 'a',
> P.parse (many (constP "&" 'a')) "& &" ~?= Right "aa"
> ]
> -- >>> runTestTT test_constP
We will also use stringP
for some useful operations that parse between
delimiters, consuming additional whitespace.
> parens :: Parser a -> Parser a
> parens x = P.between (stringP "(") x (stringP ")")
> braces :: Parser a -> Parser a
> braces x = P.between (stringP "{") x (stringP "}")
> -- >>> P.parse (many (brackets (constP "1" 1))) "[1] [ 1] [1 ]"
> -- Right [1,1,1]
> brackets :: Parser a -> Parser a
> brackets x = undefined
Parsing Constants
Now let's write parsers for the Value
type, except for table constants
(which we won't parse).
> valueP :: Parser Value
> valueP = intValP <|> boolValP <|> nilValP <|> stringValP
To do so, fill in the implementation of the four parsers above. As above, these four parsers should consume any following whitespace. You can make sure that happens by testing 'many' uses of the parser in a row.
> -- >>> P.parse (many intValP) "1 2\n 3"
> -- Right [IntVal 1,IntVal 2,IntVal 3]
> intValP :: Parser Value
> intValP = undefined
> -- >>> P.parse (many boolValP) "true false\n true"
> -- Right [BoolVal True,BoolVal False,BoolVal True]
> boolValP :: Parser Value
> boolValP = undefined
> -- >>> P.parse (many nilValP) "nil nil\n nil"
> -- Right [NilVal,NilVal,NilVal]
> nilValP :: Parser Value
> nilValP = undefined
Lu literal strings are sequences of non-quote characters between double quotes. For simplicity, Lu does not allow escaped quote characters to appear in literal strings.
> stringValP :: Parser Value
> stringValP = undefined
> test_stringValP :: Test
> test_stringValP = TestList [
> P.parse stringValP "\"a\"" ~?= Right (StringVal "a"),
> P.parse stringValP "\"a\\\"\"" ~?= Right (StringVal "a\\"),
> P.parse (many stringValP) "\"a\" \"b\"" ~?= Right [StringVal "a",StringVal "b"],
> P.parse (many stringValP) "\" a\" \"b\"" ~?= Right [StringVal " a",StringVal "b"]
> ]
> -- >>> runTestTT test_stringValP
At this point you should be able to quickcheck the prop_roundtrip_val
property. You'll need to do this
in the terminal.
QC.quickCheck prop_roundtrip_val
Parsing Expressions
Next, let's parse some Lu expressions.
We've already stratified the grammar for you, so that we'll get the appropriate precedence and associativity for the binary and unary operators. Make sure to read the end of the parsers lecture to understand how this code works.
However, this code won't work until you complete all the parts of this section.
> expP :: Parser Expression
> expP = compP where
> compP = catP `P.chainl1` opAtLevel (level Gt)
> catP = sumP `P.chainl1` opAtLevel (level Concat)
> sumP = prodP `P.chainl1` opAtLevel (level Plus)
> prodP = uopexpP `P.chainl1` opAtLevel (level Times)
> uopexpP = baseP
> <|> Op1 <$> uopP <*> uopexpP
> baseP = tableConstP
> <|> Var <$> varP
> <|> parens expP
> <|> Val <$> valueP
>
> -- | Parse an operator at a specified precedence level
> opAtLevel :: Int -> Parser (Expression -> Expression -> Expression)
> opAtLevel l = flip Op2 <$> P.filter (\x -> level x == l) bopP
A variable is a prefix followed by some number of indexing terms or just a name. We've also done this one for you.
> -- >>> P.parse (many varP) "x y z"
> -- Right [Name "x", Name "y", Name "z"]
> -- >>> P.parse varP "(x.y[1]).z"
> -- Right (Dot (Var (Proj (Var (Dot (Var (Name "x")) "y")) (Val (IntVal 1)))) "z")
> varP :: Parser Var
> varP = mkVar <$> prefixP <*> some indexP <|> Name <$> nameP
> where
> mkVar :: Expression -> [Expression -> Var] -> Var
> mkVar e l = foldr1 (\f p u -> p (Var (f u))) l e
> prefixP :: Parser Expression
> prefixP = parens expP <|> Var . Name <$> nameP
> indexP :: Parser (Expression -> Var)
> indexP = flip Dot <$> (P.string "." *> nameP)
> <|> flip Proj <$> brackets expP
Define an expression parser for names. Names can be any sequence of upper and lowercase letters, digits and underscores, not beginning with a digit and not being a reserved word. Your parser should also consume any trailing whitespace characters.
> reserved :: [String]
> reserved = [ "and", "break","do","else","elseif","end"
> ,"false","for","function","goto","if","in"
> ,"local","nil","not","or","repeat","return"
> ,"then","true","until","while"]
> -- >>> P.parse (many nameP) "x sfds _ nil"
> -- Right ["x","sfds", "_"]
> nameP :: Parser Name
> nameP = undefined
Now write parsers for the unary and binary operators. Make sure you check out the manual or the LuSyntax module for the list of all possible operators. The tests are not exhaustive.
> -- >>> P.parse (many uopP) "- - #"
> -- Right [Neg,Neg,Len]
> uopP :: Parser Uop
> uopP = undefined
> -- >>> P.parse (many bopP) "+ >= .."
> -- Right [Plus,Ge,Concat]
> bopP :: Parser Bop
> bopP = undefined
Finally write a parser for table construction:
> -- >>> P.parse tableConstP "{ x = 2, [3] = false }"
> -- Right (TableConst [FieldName "x" (Val (IntVal 2)),FieldKey (Val (IntVal 3)) (Val (BoolVal False))])
> tableConstP :: Parser Expression
> tableConstP = undefined
At this point you should be able to quickcheck the prop_roundtrip_exp
property.
Parsing Statements
Finally, define a parser for statements ...
> statementP :: Parser Statement
> statementP = undefined
... and one for blocks.
> blockP :: Parser Block
> blockP = undefined
At this point you should be able to quickcheck the prop_roundtrip_stat
property.
Parsing Expressions and Files
Finally, we'll export these convenience functions for calling the parser.
> parseLuExp :: String -> Either P.ParseError Expression
> parseLuExp = P.parse expP
> parseLuStat :: String -> Either P.ParseError Statement
> parseLuStat = P.parse statementP
> parseLuFile :: String -> IO (Either P.ParseError Block)
> parseLuFile = P.parseFromFile (const <$> blockP <*> P.eof)
File-based tests
> tParseFiles :: Test
> tParseFiles = "parse files" ~: TestList [
> "fact" ~: p "lu/fact.lu" wFact,
> "test" ~: p "lu/test.lu" wTest,
> "abs" ~: p "lu/abs.lu" wAbs,
> "times" ~: p "lu/times.lu" wTimes,
> "table" ~: p "lu/table.lu" wTable,
> "bfs" ~: p "lu/bfs.lu" wBfs] where
> p fn ast = do
> result <- parseLuFile fn
> case result of
> (Left _) -> assert False
> (Right ast') -> assert (ast == ast')
Unit Tests
These unit tests summarize the tests given above.
> test_comb = "parsing combinators" ~: TestList [
> P.parse (wsP P.alpha) "a" ~?= Right 'a',
> P.parse (many (wsP P.alpha)) "a b \n \t c" ~?= Right "abc",
> P.parse (stringP "a") "a" ~?= Right (),
> P.parse (stringP "a") "b" ~?= Left "No parses",
> P.parse (many (stringP "a")) "a a" ~?= Right [(),()],
> P.parse (constP "&" 'a') "& " ~?= Right 'a',
> P.parse (many (constP "&" 'a')) "& &" ~?= Right "aa",
> P.parse (many (brackets (constP "1" 1))) "[1] [ 1] [1 ]" ~?= Right [1,1,1]
> ]
> test_value = "parsing values" ~: TestList [
> P.parse (many intValP) "1 2\n 3" ~?= Right [IntVal 1,IntVal 2,IntVal 3],
> P.parse (many boolValP) "true false\n true" ~?= Right [BoolVal True,BoolVal False,BoolVal True],
> P.parse (many nilValP) "nil nil\n nil" ~?= Right [NilVal,NilVal,NilVal],
> P.parse stringValP "\"a\"" ~?= Right (StringVal "a"),
> P.parse stringValP "\"a\\\"\"" ~?= Right (StringVal "a\\"),
> P.parse (many stringValP) "\"a\" \"b\"" ~?= Right [StringVal "a",StringVal "b"],
> P.parse (many stringValP) "\" a\" \"b\"" ~?= Right [StringVal " a",StringVal "b"]
> ]
> test_exp = "parsing expressions" ~: TestList [
> P.parse (many varP) "x y z" ~?= Right [Name "x", Name "y", Name "z"],
> P.parse varP "(x.y[1]).z" ~?= Right (Dot (Var (Proj (Var (Dot (Var (Name "x")) "y")) (Val (IntVal 1)))) "z"),
> P.parse (many nameP) "x sfds _ nil" ~?= Right ["x","sfds", "_"],
> P.parse (many uopP) "- - #" ~?= Right [Neg,Neg,Len],
> P.parse (many bopP) "+ >= .." ~?= Right [Plus,Ge,Concat],
> P.parse tableConstP "{ x = 2, [3] = false }" ~?=
> Right (TableConst [FieldName "x" (Val (IntVal 2)),FieldKey (Val (IntVal 3)) (Val (BoolVal False))])
> ]
> test_stat = "parsing statements" ~: TestList [
> P.parse statementP ";" ~?= Right Empty,
> P.parse statementP "x=3" ~?= Right (Assign (Name "x") (Val (IntVal 3))),
> P.parse statementP "if x then y=nil else end" ~?=
> Right (If (Var (Name "x")) (Block [Assign (Name "y") (Val NilVal)]) (Block [])),
> P.parse statementP "while nil do end" ~?=
> Right (While (Val NilVal) (Block [])),
> P.parse statementP "repeat ; ; until false" ~?=
> Right (Repeat (Block [Empty, Empty]) (Val (BoolVal False)))
> ]
> -- >>> runTestTT test_stat
Testing summary
> test_all :: IO Counts
> test_all = runTestTT $ TestList [ test_comb, test_value, test_exp, test_stat, tParseFiles ]
> -- >>> test_all
> qc :: IO ()
> qc = do
> putStrLn "roundtrip_val"
> QC.quickCheck prop_roundtrip_val
> putStrLn "roundtrip_exp"
> QC.quickCheck prop_roundtrip_exp
> putStrLn "roundtrip_stat"
> QC.quickCheck prop_roundtrip_stat