Attribute Grammars with Monads

Attribute grammars evaluate recursive data structures where inherited attributes are passed from the root to the leaves, and synthesised attributes are calculated in the leaves and passed back to the root.

In “Beginning Haskell” pp343-4 (Apress ISBN 978-1-4302-6250-3), Alejandro Serrano Mena briefly discusses implementing attribute grammars using the Reader and Writer monads. He does not give a self-contained implementation, and the fragments he shows relate to a fairly complex running example.

Here I have implemented a simpler grammar (taken from the same chapter), which does some trivial arithmetic with the length of strings. (The oddly named type AmountOf makes more sense in its original context.)

The comments are notes for a DSL like UUAG or HappyAG.

SEE CORRECTION BELOW

{- pass through user language extensions -}

{- boiler plate -}
import Control.Monad.Reader
import Control.Monad.Writer
{- pass through user imports -}

{- declare grammar type (eg Expr) -}

{- declare result type (eg Result Int) and transform into -}
newtype Result = Result Int deriving Show
instance Semigroup Result where _ <> x = x
instance Monoid Result where mempty = undefined -- deliberate

{- Option: combine grammar and semantic functions into single declaration. Has pros and cons. -}
{- substitute grammar type into grammar definition -}
data Expr = Plus Expr Expr
          | Times Expr Expr
          | AmountOf [Char]

{- transform grammar type into -}
sem::Expr -> ReaderT Expr (Writer Result) ()

{- transform semantic functions
     Plus x y = (syn x) + (syn y)
     Times x y = (syn x) + (syn y)
     AmountOf x = length x
into -}
sem (Plus x y) = do (_,Result x') <- listen (sem x)
                    (_,Result y') <- listen (sem y)
                    let k = x' + y'
                    tell $ Result k

sem (Times x y) = do (_,Result x') <- listen (sem x)
                     (_,Result y') <- listen (sem y)
                     let k = x' * y'
                     tell $ Result k

sem (AmountOf x) = do tell $ Result (length x)

{- declare evaluator (eg executeExpression) and tranform into -}
executeExpression::Expr -> Result
executeExpression e = execWriter (runReaderT (sem e) e)

{- pass through native Haskell code, substituting grammar and result types and evaluator -}
example::Expr
example = let a = AmountOf "Alice"
              b = AmountOf "Bob" in
          Plus (Plus a b) (Plus (Times a a) (Times b b))

main = print $ executeExpression example

The pattern is clear:

  1. declare a result type with a fake monoid instance† where the semigroup operation returns the second operand
  2. declare the expression grammar
  3. create functions implementing the semantics of each subtype of the grammar, using recursive calls to pass inherited attributes to child nodes, ‘listen’ to get synthesised attributes from child nodes, and ‘tell’ to pass synthesised attributes to parent nodes.
  4. call execWriter/runReader

† According to the monoid laws, x <> mempty should be x. If it is defined correctly, the program runs and would pass the most exacting code review, but gives wrong answers.

The program looks simple enough to implement with a plain evaluation function, without the monad apparatus. One snag is that the result type is not part of the expression grammar, so a function eval::Expr->Int cannot handle nested expressions that are already evaluated to Int. A separate channel for returning the evaluations is required.

Postscript: I noted that the program does not use ‘ask’ or ‘local’, and found that it was possible to completely remove the Reader monad. This grammar is too simple to require it, and I need to find a better example.

Correction The above code works, but does not represent an attribute grammar. I had misunderstood two things:

  1. executeExpression should return a function ::String->String->Result Int rather than a Result
  2. Reader monad should replace parameter passing.
import Control.Monad.Reader
import Control.Monad.Writer

newtype Result = Result Int deriving Show
instance Semigroup Result where _ <> x = x
instance Monoid Result where mempty = undefined -- deliberate

data Expr a = Plus (Expr a) (Expr a)
            | Times (Expr a) (Expr a)
            | AmountOfL
            | AmountOfR deriving Show

data P = P {l::String, r::String} -- record of parameters

sem::Expr P -> ReaderT P (Writer Result) ()

sem (Plus x y) = do (_,Result x') <- listen (sem x)
                    (_,Result y') <- listen (sem y)
                    let k = x' + y'
                    tell $ Result k

sem (Times x y) = do (_,Result x') <- listen (sem x)
                     (_,Result y') <- listen (sem y)
                     let k = x' * y'
                     tell $ Result k

sem AmountOfL = do p<-ask
                   tell $ Result (length (l p))

sem AmountOfR = do p<-ask
                   tell $ Result (length (r p))

executeExpression :: Expr P -> P -> Result
executeExpression e p = execWriter (runReaderT (sem e) p)

example::Expr a
example = let l = AmountOfL
              r = AmountOfR in
          Plus (Plus l r) (Plus (Times l l) (Times r r))

main = print $ executeExpression example $ P {l="Alice",r="Bob"}

AmountOfL and AmountOfR, to measure the lengths of the first and second parameters, have to be separate types, because no parent node can know which will be called by child nodes (it could be both).

It is straightforward to generate this code from a grammar definition, at the cost of some complexity. I need to investigate the performance.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: