MP3 – CPS Transform

Logistics

• revision: 1.2

Changelog

1.2

• New tests, clarifications and restructuring based on feedback

1.1

• Updated deadline; fixed parseExp and parseDecl examples; gave type for

cpsDecl.

Objectives

Please read through this entire README before you begin, as some of the early

sections may refer to content in later sections. View the PDF version for the

most accurate rendering of the notation.

One of the main objectives of this course is to make you comfortable translating

code representations from one form to another. In MP2, you wrote an interpreter

to evaluate expressions. In this MP you will translate a very small subset of

functional-style code from direct style into continuation passing style (CPS).

Functional-language compilers can use CPS to express control-flow so that

functional language can be compiled to an imperative target.

This will ensure that you understand CPS, and also help you be able to take a

mathematical description of a program transformation and implement it in code.

You will also practice using a method of propagating state by using a technique

known as plumbing.

Goals

• Be able to manually translate a chunk of simple functional code into CPS.

• Write some functions which can automatically translate simple functional

code into CPS.

1

Getting Started

Relevant Files

In the directory app you’ll find Main.hs with all the relevant code. In this

file you will find all of the data definitions, a simple parser, the REPL, and

stubbed-out functions for you to finish.

This README document contains LATEX formatting. To view it properly, look

at the README.pdf file you were given, instead of the README.md file that is

shown by default on your GitLab repo page.

Running Code

To run your code, start GHCi with stack ghci (make sure to load the Main

module if stack ghci doesn’t automatically). From here, you can test individual

functions, or you can run the REPL by calling main. Note that the initial $ and

> are prompts.

$ stack ghci

… More Output …

Prelude> :l Main

Ok, modules loaded: Main.

*Main> main

To run the REPL directly, build the executable with stack build and run it

with stack exec main.

Testing Your Code

As in MP2, you will be able to run the test-suite with stack test:

$ stack test

It will tell you which test-suites you pass, fail, and have exceptions on. To see

an individual test-suite (so you can run the tests yourself by hand to see where

the failure happens), look in the file test/Spec.hs.

You can run individual test-sets by running stack ghci –test and choosing

to load the Spec module when prompted. Then you can run the tests (specified

in test/Tests.hs) just by using the name of the test:

*Spec Main Spec Tests> tests_factk

[True,True,True,True,True,True]

Look in the file test/Tests.hs to see which tests were run.

2

Given Code

The Types

We have two types for this program. The first is called Stmt, and it corresponds

roughly to a Haskell function declaration. It has one constructor Decl ::

String -> [String] -> Exp -> Stmt. The first argument is the name of the

function, the second is a list of parameter names, and the last is the function

body (which is an Exp).

We have also provided a Show instance for the Stmt type, which means that

the function show :: Read a => a -> String is defined for our type. This

enables pretty-printing so that we can get useful output.

data Stmt = Decl String [String] Exp

deriving (Eq)

instance Show Stmt where

show (Decl f params exp) = f ++ ” ” ++ intercalate ” ” params ++ ” = ” ++ (show exp)

The second type we have is called Exp, which represents expressions. There are

six constructors that are encapsulated within Exp. We have IfExp, with the

usual three arguments; AppExp for function application (note: functions are only

applied to one of their arguments at a time); IntExp and VarExp for integers

and variables; an OpExp that takes an operator (as a string) with two operands;

and finally a LamExp to represent anonymous functions.

We’ve also provided a Show instance for the Exp type so that it can be prettyprinted as well.

data Exp = IntExp Integer

| VarExp String

| LamExp String Exp

| IfExp Exp Exp Exp

| OpExp String Exp Exp

| AppExp Exp Exp

deriving (Eq)

instance Show Exp where

show (VarExp s) = s

show (IntExp i) = show i

show (LamExp x e) = “(\\” ++ x ++ ” -> ” ++ (show e) ++ “)”

show (IfExp e1 e2 e3) = “(if ” ++ show e1 ++ ” then ” ++ show e2

++ ” else ” ++ show e3 ++ “)”

show (OpExp op e1 e2) = “(” ++ show e1 ++ ” ” ++ op ++ ” ” ++ show e2 ++ “)”

show (AppExp f e) = show f ++ ” ” ++ show e

You may also be interested in seeing how a string is parsed and interpreted

3

as something of type Exp. To see how, use the following function ctorParse

:: String -> String. This function will parse the expression normally, but

instead of using the default Show instance to display it, it will explicitely show

how the Exp is built from data-constructors.

ctorShow :: Exp -> String

ctorShow (VarExp s) = “VarExp ” ++ show s

ctorShow (IntExp i) = “IntExp ” ++ show i

ctorShow (LamExp x e) = “LamExp ” ++ show x ++ ” (” ++ ctorShow e ++ “)”

ctorShow (IfExp e1 e2 e3) = “IfExp (” ++ ctorShow e1 ++ “) (”

++ ctorShow e2 ++ “) (”

++ ctorShow e3 ++ “)”

ctorShow (OpExp op e1 e2) = “OpExp ” ++ show op ++ ” (”

++ ctorShow e1 ++ “) (”

++ ctorShow e2 ++ “)”

ctorShow (AppExp f e) = “AppExp (” ++ ctorShow f ++ “) (” ++ ctorShow e ++ “)”

fromParse :: Either ParseError Exp -> Exp

fromParse (Right exp) = exp

fromParse (Left err) = error $ show err

ctorParse :: String -> String

ctorParse = ctorShow . fromParse . parseExp

The Parser

We’ve provided you with a parser again this time. It will live in app/Main.hs

with the rest of the code.

First, we have a type synonym to make reading the types of parsers easier.

— Pretty parser type

type Parser = ParsecT String () Identity

Lexicals

Lexical parsers are just small parsers which look for very simple structure on

the input stream. You’ll notice that the return types of these parsers are all

simple types (like Parser String, or Parser Integer). This means that they

don’t have any meaning for our programming language yet, we’ll be using them

as building blocks for more complex parsers.

symbol :: String -> Parser String

symbol s = do string s

spaces

return s

4

int :: Parser Integer

int = do digits <- many1 digit <?> “an integer”

spaces

return (read digits :: Integer)

var :: Parser String

var = let keywords = [“if”, “then”, “else”]

in try $ do v1 <- letter <?> “an identifier”

vs <- many (letter <|> digit) <?> “an identifier”

spaces

let v = v1:vs

if (any (== v) keywords)

then fail “keyword”

else return v

oper :: Parser String

oper = do op <- many1 (oneOf “+-*/<>=”) <?> “an operator”

spaces

return op

parens :: Parser a -> Parser a

parens p = do symbol “(”

pp <- p

symbol “)”

return pp

Expressions

Now we can write some parsers that are actually relevant to our programming

language. You’re not expected to fully understand these parsers yet, or to be

able to reproduce them. For your next MP though, you will have to write a

simple parser, so it might be worth it to read through and see if it makes sense.

The majority of the complexity in our grammar is with expressions, which is

why the parsers for them are more complicated than for lexicals or statements.

This is apparent when we look at how many data-constructors we have for Exp,

we need to make sure our parser can handle each of them.

intExp :: Parser Exp

intExp = do i <- int

return $ IntExp i

varExp :: Parser Exp

varExp = do v <- var

return $ VarExp v

5

opExp :: String -> Parser (Exp -> Exp -> Exp)

opExp str = do symbol str

return (OpExp str)

mulOp :: Parser (Exp -> Exp -> Exp)

mulOp = opExp “*” <|> opExp “/”

addOp :: Parser (Exp -> Exp -> Exp)

addOp = opExp “+” <|> opExp “-”

compOp :: Parser (Exp -> Exp -> Exp)

compOp = try (opExp “<=”) <|> try (opExp “>=”)

<|> opExp “<” <|> opExp “>”

<|> opExp “/=” <|> opExp “==”

ifExp :: Parser Exp

ifExp = do try $ symbol “if”

e1 <- expr

symbol “then”

e2 <- expr

symbol “else”

e3 <- expr

return $ IfExp e1 e2 e3

lamExp :: Parser Exp

lamExp = do try $ symbol “\\”

param <- var

symbol “->”

body <- expr

return $ LamExp param body

atom :: Parser Exp

atom = intExp

<|> ifExp

<|> lamExp

<|> varExp

<|> parens expr

expr :: Parser Exp

expr = let arith = term `chainl1` addOp

term = factor `chainl1` mulOp

factor = app

app = do f <- many1 atom

return $ foldl1 AppExp f

in arith `chainl1` compOp

6

We’ve also provided the function parseExp :: String -> Either ParseError

Exp which can be used to see what the parser returns for a specific input string.

Combined with the function ctorParse that was introduced above, this is useful

for understanding how our abstract syntax tree (AST) data structures are being

used to store programs written in our language.

parseExp :: String -> Either ParseError Exp

parseExp str = parse expr “stdin” str

*Main> parseExp “x + 1”

Right (x + 1)

*Main> parseExp “x + 1 asdf*” — This parse will fail.

Left “stdin” (line 1, column 12):

unexpected end of input

expecting white space, an integer, “if”, “\\”, an identifier or “(”

*Main> putStrLn $ ctorParse “x + 1” — Show the AST structure.

OpExp “+” (VarExp “x”) (IntExp 1)

Declarations

Once we can parse expressions, declarations are comparatively simple.

decl :: Parser Stmt

decl = do f <- var

params <- many1 var

symbol “=”

body <- expr

return $ Decl f params body

Once again, we have provided the function parseDecl :: String -> Either

ParseError Stmt which can be used to investigate the connection between the

grammar of our language and abstract syntax tree of a particular program.

(We have not provided an equivalent of ctorParse for Stmt, but you could try

making your own as an exercise.)

parseDecl :: String -> Either ParseError Stmt

parseDecl str = parse decl “stdin” str

*Main> parseDecl “f x = x + 1”

Right f x = (x + 1)

*Main> parseDecl “f x = x + 1 asdf*” — This parse will fail.

Left “stdin” (line 1, column 18):

unexpected end of input

expecting white space, an integer, “if”, “\\”, an identifier or “(”

7

The REPL

The REPL can be used to enter CPS declarations and see the resulting translations. It will prompt you for string, pass it through the parseDecl parser,

check for success, and if so pass it through cpsDecl. Then the result will be

pretty-printed using the show :: Show a => a -> String function.

prompt :: String -> IO ()

prompt str = hPutStr stdout str >> hFlush stdout

printLn :: String -> IO ()

printLn str = hPutStrLn stdout str >> hFlush stdout

repl :: IO ()

repl = do input <- prompt “> ” >> getLine

case input of

“quit” -> return ()

_ -> do case parseDecl input of

Left err -> do printLn “Parse error!”

printLn $ show err

Right decl -> printLn . show $ cpsDecl decl

repl

main :: IO ()

main = do putStrLn “Welcome to the CPS Transformer!”

repl

putStrLn “GoodBye!”

Problems

Manual Translation

First, you’ll manually translate a few Haskell functions into CPS. This can help

you gain an intuition about how the automatic translator will work.

factk :: Integer -> (Integer -> t) -> t

Write the factorial function in continuation passing style. The first argument is

the number that it should compute the factorial of. The second argument is the

continuation to apply to the result.

Note that you will not receive credit if you calculate the factorial separately

using a normal factorial function then put it through the continuation. You

8

should instead recursively cal factk with an updated continuation which contains

the computation to make at this step in the recursion as well as the original

continuation.

*Main Lib> factk 10 id

3628800

evenoddk :: [Integer] -> (Integer -> t) -> (Integer -> t) -> t

Write the function evenoddk in continuation passing style.

The evenoddk function takes a list and two continuations. The first continuation,

if run, will receive the sum of all the even elements of the list. The second

continuation, if run, will receive the sum of all the odd elements of the list.

No additions should be performed until a continuation is called. This means

that you need to build up the addition to make in each continuation, not do the

addition as you process the list.

The function decides what to do when it gets to the last element of the list. If it

is even, it calls the even continuation, otherwise it calls the odd continuation.

You can assume the input list will have at least one entry in it.

*Main Lib> evenoddk [2,4,6,1] id id

1

*Main Lib> evenoddk [2,4,6,1,4] id id

16

*Main Lib> evenoddk [2,4,6,1,9] id id

10

Automated Translation

Here, you’ll define two functions do all the translation. We will call cpsDecl

on a declaration (Stmt). This function will return a new declaration with an

added argument k, and a body which has been CPS transformed to use that as

its continuation.

We’ll also have a function cpsExp which is used to translate an Exp into CPS.

It takes an extra argument, which is an integer, and we return an integer in

addition to the transformed expression. These integers are how we are going to

get fresh variables each time we want to generate a new continuation. They act

as a sort of accumulator, receiving the number for the next fresh variable and

returning a potentially updated number after the transformation is performed.

In particular, we have given you a utility function called gensym (for “generate

symbol”) that generates fresh variables each time it is called using that particular

integer. Whenever we need a fresh variable, we call gensym with our old integer.

9

gensym :: Integer -> (String, Integer)

gensym i = (“v” ++ show i, i + 1)

*Main Lib> gensym 20

(“v20”,21)

Notice that the return type for both gensym and cpsExp is a tuple of an expected

return type and an integer; this is how we remember what numbers have already

been used (and what comes next). This is a common pattern in languages that

do not have mutation (or even languages that do but we chose not to use it),

and is often called “plumbing.”

The CPS transform [[D]] represents a program transform taking declaration D

from direct style to continuation passing style. The CPS transform [[E]]k

represents a program transform taking expression E from direct style to continuation

passing style with k as its continuation. Remember that the continuation is a

function to apply to the result of the current expression. In this example, we

will call the continuation k on the value that expression E evaluates to.

We will also distinguish between simple expressions (no available function calls)

and normal expressions. (We will go into more detail later about how we will

distinguish simple from normal expressions.) Operators will be left in direct

style, and we will not be converting λ-expressions in this MP (though your code

will certainly emit them!).1

To indicate that an expression is simple, we will put an overbar on it, like this:

e. Some expressions, like variables and integers, are obviously simple so we will

omit the overbar in such a case. As an example, e1 + e2 indicates that e1 is

simple, but that e2 is not. Thus, e2 would need to be transformed.

Define isSimple

The isSimple function takes an expression and determines if it is simple or

not. A simple expression, in our context, is one that does not have an available

function call. A function call is available if it’s possible for the current expression

to execute it. In the language subset you have been given, a function call is

always available unless it is within the body of an unapplied λ-expression. We

aren’t transforming λ-expressions in this MP, so you can ignore implementing

isSimple over LamExps.

*Main Lib> isSimple (AppExp (VarExp “f”) (IntExp 10))

False

*Main Lib> isSimple (OpExp “+” (IntExp 10) (VarExp “v”))

1Most automatic CPS transforms do not make this distinction: there are, in fact, many

different CPS transforms, and researchers have spent a lot of time coming up with transforms

that have different properties. This means that if you find a paper describing a CPS transform,

odds are excellent it will not look exactly like this one. You should still find it recognizable

though!

10

True

*Main Lib> isSimple (OpExp “+” (IntExp 10) (AppExp (VarExp “f”) (VarExp “v”)))

False

Define cpsExp – Overview

You’ll need to define cpsExp over all of the data-constructors that make up the

Exp type. The first argument to cpsExp is the expression you are transforming,

and the second is the current continuation that has been built up. Here is the

type signature:

cpsExp :: Exp -> Exp -> Integer -> (Exp, Integer)

Remember that we have provided you with a parser and the function parseExp,

which you can use to type in expressions for testing. Instead of having to write

the entire Haskell AST of an expression, you can use parseExp.

Define cpsExp for Integer and Variable Expressions

[[i]]k = k i

[[v]]k = k v

In Haskell, this would (almost) be implemented as:

almostCpsExp :: Exp -> Exp -> Exp

almostCpsExp (IntExp i) k = AppExp k (IntExp i)

almostCpsExp (VarExp v) k = AppExp k (VarExp v)

But, remember that we have to add in the extra “plumbing” which keeps track of

how many fresh variables we’ve generated, which accounts for the extra Integer

in both the arguments and return of cpsExp.

Define cpsExp for Application Expressions

[[f e]]k = f e k

[[f e]]k = [[e]](λv.f v k) where v is fresh.

Note there are two rules: one for when the argument is simple, and one for when

the argument needs a conversion. We are going to assume that the argument is

run first, the result of which is passed into the function.

Remember that function application is left-associative. That is, assuming that

we use currying, f e k = ((f e) k). Also recall that in our terminology, we

“apply” functions to arguments. We do not “apply” arguments to functions.

11

Notice the “where v is fresh” part. Variables don’t grow moldy, but if you give

every continuation function a parameter named v, it won’t work. We could have

nested continuations which lead to unwanted α-capture. Therefore, we need to

generate new names whenever we make a new λ-expression. The safest strategy

is to make sure that no name is ever reused, regardless of which branch of logic

is eventually taken.

This should be done using the gensym :: Integer -> (String, Integer)

function explained above. Remember that cpsExp needs to return the most

up-to-date fresh variable number so that further calls to cpsExp don’t use the

same variables again.

Define cpsExp for Operator Expressions

These are the most complex, since there are four possible cases.

For ⊗ a binary operator:

[[e1 ⊗ e2]]k = k (e1 ⊗ e2)

[[e1 ⊗ e2]]k = [[e1]](λv.k(v⊗e2) where v is fresh.

[[e1 ⊗ e2]]k = [[e2]](λv.k(e1⊗v) where v is fresh.

[[e1 ⊗ e2]]k = [[e1]](λv1.[[e2]](λv2.k(v1⊗v2))) where v1 and v2 are fresh.

Notice that we are careful to preserve the order of the arguments to the operator

– the order we convert expressions to is very important (that is, e1 and e2 cannot

be swapped on the right-hand side of any of the above), because this preserves

the order of evaluation. In real life it doesn’t matter the order we convert them

in (that is, whether we choose to convert e1 or e2 first), as long as their results

are passed in the correct order, but to keep from annoying the graders2 please

evaluate things from left to right.

Define cpsExp for If Expressions

[[if e1 then e2 else e3]]k = if e1 then [[e2]]k

else [[e3]]k

[[if e1 then e2 else e3]]k = [[e1]](λv.if v then [[e2]]k

else [[e3]]k

) where v is fresh.

For an if expression with a not-simple guard, we need to transform the guard

and give it a continuation that selects the proper (transformed) branch for us.

If the guard is simple, we can leave it alone and transform the branches using

the original continuation.

Notice how the order of evaluation becomes explicit when we make this transform;

the condition of the if expression is evaluated first, and only then is one of the

2Annoying a grader is bad luck.

12

branches picked for evaluation. This is part of the power of CPS, you can choose

exactly how expressions in your language are simplified down.

Define cpsDecl

Declarations are much simpler than expressions because we don’t need to maintain

the state of a fresh-variable counter when translating declarations. This is because

each declaration has its own local variable scope anyway.

[[f x1 · · · xn = e]] = (f x1 · · · xn k = [[e]]k

)

This corresponds to the cpsDecl function. It adds an extra parameter k to the

parameter list, and then translates the body of the function. (The parentheses

on the right-hand side are added for clarity – the equality on the left tranforms

to the whole equality on the right.) Here is the type signature for cpsDecl:

cpsDecl :: Stmt -> Stmt

Monads (optional)

This part is highly optional.

Perhaps you are thinking “couldn’t we use a monad to get rid of this plumbing?”

If so, you are correct. There is a monad called the State monad that can do

this for us. (In fact, there is a GenSym monad as well!)

You are not required to use monads for this MP. But if you would like to use

them, you may, as long as you do not change the type of cpsDecl or cpsExp.

Define your own function cpsExpM, the monadic version of cpsExp, and you

can define cpsExp in terms of cpsExpM instead. The testing interface to cpsExp

must remain unchanged.

13

Sale!