home—lectures—exams—hws—breeze (snow day)

hw07
Interpreting N
project

The following tests are due Nov.02 (Fri).
The full N1,N2 (racket and Java, plus any add'l tests) due Nov.05 (Mon).

Over the course of several homeworks, we'll implement a “tower” of languages (N0, N1, …, N6) each language incorporating more features than the previous. N0 is provided for you. You'll implement N1 in both Racket and Java; after that you choose which of the two languages you'll use to complete N2, …, N6.

For each language, you must implement three methods:
- parse, which turns source-code into an internal representation,
- toString which turns internal representation back into source-code,
- eval, which actually interprets the program, returning a value (a number for N0-N2, and a number-or-function in N3).
I recommend implementing and testing the methods in this order. See the provided test-cases, for examples. There might be other helper functions to implement as well.
Your solution may be in Java or in Racket (or see me, if you want to use a different language). (The exception is N1, which you must implement in both Racket and Java.)
You do not need to worry about error-checking -- we will only concern ourselves with syntactic N0 (etc.) programs.
(As a matter of defensive programming, you might still want to add some assertions/sanity-checks in your code.)
Use the Desire2Learn discussion board for group questions and answers.
This project is based on the first chapters of Programming Languages and Interpretation, by Shriram Krishnamurthi. As a result, this homework assignment is covered by the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. Although we're using a different dialect of scheme than that book, you might find it helpful to skim it.

The language N0

  Expr      ::= Number | ParenExpr | BinExpr | Is0Expr
  ParenExpr ::= ( Expr ) 
  BinExpr   ::= ( Expr BinOp Expr )
  Is0Expr   ::= if Expr is0 then Expr else Expr ;
  BinOp     ::= plus | minus | times

where Number is any numeric literal (as written in either Java or Racket, your choice ¹) and Var is string with no whitespace or punctuation (parentheses, “;”) and is not otherwise a reserved word in N0 (i.e. a terminal in the above grammar -- “if”, “plus”, “is0”, etc.). Whitespace is required between all terminals/non-terminals, with the exception of punctuation.

Here is a Racket interpreter for N0 along with helper functions for a scanner for scheme and an optional test harness.
Here is a java interpreter for N0 (N0-java.jar or browse); it includes unit tests, and helpful Java parsing functions.

(10pts) N1 is just like N0, but with two additional types of expressions:
Expr ::= … | IsNegExpr IsNegExpr ::= if Expr isNeg then Expr else Expr ; BinOp ::= … | mod
Update parse, toString (a.k.a. expr->string), and eval appropriately, for both the scheme and Java implementations. Be sure to write test cases first.

The only method which cares what these new expressions mean (their semantics) is eval: if Expr₀ isNeg then Expr₁ else Expr₂; first evaluates just Expr₀; if that value is negative, then it evaluates Expr₁ and returns its value; otherwise it evaluates Expr₂ and returns its value. (Note how you are implementing short-circuit semantics for isNegExpr!)

(a mod b) should evaluate to ² a mod b, where the result is always between 0 (inclusive) and b (exclusive); in particular, the result is never positive if b<0. Notice that this is slightly different behavior than either Java's built-in % (which behaves differently on negative numbers), and from Racket's built-in modulo (which only accepts integers). In Java, you can use a%b (if b>0 or a%b=0) and a%b+b (otherwise). In scheme or Java, you can use b*(a/b-floor(a/b))

Note that you are provided sufficient test cases for modExprs, except that you have to translate it proper N0. You must make your own test cases for ifNegExprs; include at least two as-simple-as-possible tests, and two tests with more deeply nested Exprs.
(30pts) N2 adds identifiers to N1:
Expr ::= … | Id | LetExpr LetExpr ::= let Id = Expr in Expr;
where Id can be any series of letters and digits which isn't interpretable as a number³. (Assume for now that any nested let expressions use different Ids. We'll handle that possibility in N3, later.)

Update your three methods parse, toString (a.k.a. expr->string), eval. We now need to define the semantics of let Id be (E₀) in (E₁):
- Evaluate E₀, calling the result v₀.
- Then, substitute v₀ for all occurrences of Id inside E₁; name the result of the substitution E′.
- Evaluate E′, and return that result.
Observe that when evaluating a (legal) N2 program, eval will never actually encounter an Id -- that Id will have been substituted out before we ever recur down to it.

The code to make a substitution in an Expr parse-tree ⁴ is similar to taking an Ancestor-tree, and replacing every blue-eyed Child with a green-eyed one. (If you want to write that AncTree function, it may you understand what's needed for this one. The only difference is that an AncTree had only two cond-branches, while Expr has around seven, though the code for most of those are very similar). For example: let x = 5 in (x plus 3) ⇒ (5 plus 3) ⇒ 8⁵. Be sure to write test cases for your substitution function before you write its code; include several trivial and easy tests, along with a couple of more complicated nestings and one deeply nested expression.

Note: The following is provided so that you can see where we are headed. It might be slightly revised before being officially issued, though the substance will remain the same.

(20pts) N3 is just like N2, except we now allow one variable to shadow another. Consider an expression let Id = (E₀) in (E₁) where E₁ contains some non-free (“bound”) occurrences of the same Id?
For example, we might have a let x = … which in turn contains another let x = … inside of it. In that case the inner x should shadow the outer one:
let x = 3 in let x = 5 in (x plus 3);; ⇒ let x = 5 in (x plus 3); ⇒ (5 plus 3) ⇒ 8
Thus, when substituting, only substitute “free occurrences” of an Id in E₁, not any “bound occurrences”⁶. Think through exactly what you need to substitute, in the following examples:
let y = 3 in let x = 5 in (x plus y);; ⇒ ⇒ ⇒ 8 let y = 3 in let x = y in (x plus y);; ⇒ ⇒ ⇒ 6
let x = 5 in let y = 3 in (let x = y in (x plus y); plus x);; ⇒ ⇒ ⇒ ⇒ ⇒ 11
(You might find it helpful to draw the parse tree, rather than staring at the flat string, especially for this last example.)

Make test cases for parse (optional), toString and eval involving the N3 equivalents of the following scheme expressions:
(let {[x 5]} (let {[x 3]} x)) (let {[x 5]} (let {[x (+ x 1)]} x)) (let {[x 5]} (let {[y 3]} (+ (let {[x y]} (+ x y)) x))) (let {[x (let {[x 5}} x)]} (+ x 4))
(30pts) N4 adds functions and function-application to our language:
Expr ::= … | FuncExpr | FuncApplyExpr FuncExpr ::= func(Id, Expr) FuncApplyExpr ::= <Expr(Expr)>
We restrict ourselves to unary functions (functions of one argument). (To think about: if we wanted to 'fake' functions of 2 arguments in N4, how could we do it?)

Just like numbers are self-evaluating, so are FuncExprs. If evaluating (an internal representation of) a FuncExpr, just return that same (internal representation of the) function. We won't actually evaluate the body until the function is applied.

In FuncApplyExpr, the first Expr had better evaluate to a function. (That is, it might be a FuncExpr, or an Id which gets substitued to a function value. It could also be (say) an is0Expr or letExpr which evaluates to a function.) For example, here is a function for computing absolute value:
func( x, if x isNeg then (-1 times x) else x;)
A funcApplyExpr represents calling a function. Here are two expressions, both computing the absolute value of -5:
<func( x, if x isNeg then (-1 times x) else x;)(-5)> let abs = func(x, if x isNeg then (-1 times x) else x;) in <abs(-5)>;
To evaluate a function-application <Expr₀( Expr₁)>:
- Evaluate Expr₀; let's call the result f. (f had better be a function-value!)
- Evaluate Expr₁; let's call the result arg.
- Substitute f's parameter with arg in f's body; call this new expression E′.
- Evaluate E′ and return that value.
Hey, those semantics are practically the same as let!

Observe that our programs can now evaluate to either of two types: numbers or functions. In Java, we'll need to have a class which can represent either, as the return type for our eval.

You can paste/convert the N3 java test cases into the same format as the ones you've been using at the bottom of N0.rkt.

Make test cases for parse (optional), toString and eval involving making and calling the following functions:
- A constant function that always returns (say) 17.
- The absolute value function given above,
- the function sqr, which squares its input,
- The N4 equivalent of the following scheme definition make-adder:
  (define (make-adder n) (lambda (m) (+ n m))) ; Two examples of applying this function: ; (make-adder 4) ; evals to (lambda (m) (+ 4 m)) ((make-adder 3) 4) ; evals to 7
(35pts: This problem and the next are really the crux of the assignment.)
Deferred evaluation: N5 doesn't add any new syntax, but it is a different evaluation model which will give us more expressive semantics. Use a new file/project for N5, separate from N0-N4.

There are two problems ⁷ with the substitution approach used above: we can't make recursive functions, and it doesn't generalize to assigning to variables. We solve these problems with deferred substitution:
- When interpreting a program, eval will take two arguments: the program to interpret, and a set of bindings. A binding is just an Id and its associated value; a set of bindings (an environment) might be implemented with an association list or a java.util.Map<IdExpr,ValExpr>.
- In this approach, function-application and let no longer do any substitution — instead, each introduce (exactly one) new binding and proceed recursively ⁸.
- Upon encountering an Id, we just look up its value in our list of bindings. (Recall that in N2, eval never actually reached an Id; in N4 it does.)
- For full credit, make sure that you can evaluate recursive functions.
- You should not need to be doing any parse-tree-substitutions any more (though you can if you want, to temporarily address the note below). Note that in N6, no substitutions will be allowed.
A step sideways: This algorithm as described lets us add recursive functions, but it also doesn't handle some expressions that N4 did! For example, let make-adder = func(m, func(n, (n plus m)) ) in <<make-adder(3)>4>; gives an error "unbound identifier: m" if no substitution has been done, however <func(m, <func(n, (n plus m))(3)>)(4)> does still work just fine(!). The problem will be fixed in N6: in the first example, <make-adder(3)> returns a function whose body involves m and n, but not the binding of m to 3. We'd need return the function and its bindings.

Note that this gives us dynamic scoping (which we'll mention in class):
let m = 100 in let addM = func(x, (x plus m)) in ((let m = 5 in <addM(3)>;) plus (let m = 4 in <addM(3)>;)) ; ;
evaluates to 15, not 206.
(35pts) N6: Implement static scope.
- You'll want to modify your function-structures so that they include one extra piece of information: the bindings in effect when the function was declared.
  Be sure to make a careful suite of test-cases, to test for various ways of capturing bindings in different closures. (For example, consider the lecture on ways of using let* to effectively implement objects, classes, and inheritance.)
- When evaluating a function-application, use the bindings in effect back when that function was declared, for its free variables.
- You should not be doing any substitution.
- To think about: Hmm, when we first parse our expression, we'll create function-expressions, but (since we're not eval'ing) we don't have any bindings right then. Only later, when we eval a function, will we actually know about any bindings (since that call to eval was given a list of bindings)….
  
  clarification: This means that a function won't quite evaluate to itself anymore — it'll evaluate to a struct that has the same parameter and body as the original (parsed) structure, but a different environment.
- Note that getting recursive functions to work is a bit tricky, since their environment (closure) needs to include themselves, but when you're evaling a func-expr you don't yet have its associated name, hmmm. Just after the func-expr is created then you'll want to go in and change its environment to include itself. The easiest way is to do this with mutation (something like set-func-expr-env! ⁹).
Be sure to make a copy of your N5 project files before starting N6.
Further extra-credit options (of varying difficulty):
- Add comments to the grammar and parser. Make sure they nest. It's up to you whether or not you store comments in the internal representation, or discard them entirely ¹⁰.
- Re-write the code for evaluating let so that it simply transforms it into a function-application, and evaluates that.
- Generalize functions to multiple arity, and/or generalize let so that it takes any number of id/value pairs. (Really this would be like scheme's letrec, since N4 allows recursion.)
- If you want, modify the let syntax: instead of let id = Expr in Expr;, you can use Id = Expr ; Expr ; . (Note that this still represent declaring a new variable, not mutating an existing one.) Your programs now look like procedural programs.
- Add mutation (i.e. assigning to Ids): Id ← Expr;. (If you want to use mutation in your underlying scheme implementation, you can use set-first! and set-struct-field!¹¹. You can also think about how you might try implementing mutation in N6 without actually using mutation in your underlying program.)
- Once we have assignment, add the equivalent of scheme's begin. (You might use “{” and “}” to delimit such a “block” of statements; you now have implemented all the procedural concepts of a Java or Python interpreter, including first-class functions! And as seen in lecture, you also could write superficial transformations to get most of an object system as well.)

¹ This is so we can just use our language's built-in number-parsing functions, without getting bogged down in tokening input. So scheme implementations will allow exactly those strings recognized by number?, (including +nan.0, -inf.0, and 2+3i). Similarly, if using Java, the semantics of N0's arithmetic will be similar to IEEE floating point arithmetic (rather than perfectly-correct arithmetic). ↩

² Because we don't need to check for bad inputs, it's fine to have your interpreter crash if b=0. If you prefer to "control" crash — creating a meaningful error message and calling error or throw yourself — you are also welcome to do that. ↩

³ Note that our different implementations are now varying by more than just precision of arithmetic: in a Java implementation, NaN is a Num, and in a scheme implementation it's an Id. We won't use any test cases involving such subtle differences. However, note how our choices in designing a new language are being influenced by the language we're trying to easily implement it in! This stems from the fact that a primary design constraint on N1 is that implementing an intepreter for N1 doesn't get bogged down in minutae when using either Java or Racket. ↩

⁴ By the way: you will receive no credit for doing substitution on a string -- all our code should work on the parse tree itself. String-substitution (like C pre-processor macros) can't be generalized to handle shadowed variables (scope) for N3, and is in general fraught with error. A local-variable construct which requires globally-unique names isn't very impressive! ↩

⁵ The notation “let x = 5 in (x plus 3); ⇒ 5 plus 3 ⇒ 8” is shorthand for

  eval(parse("let x = 5 in (x plus 3);"))
= eval(parse("(5 plus 3)"))
= eval(parse("8"))

Observe how we definitely don't write “"let x = 5 in (x plus 3)" = "(5 plus 3)" = 8” since the two strings are not .equals(·) to each other, and strings are never ints. More specifically: we distinguish between “⇒” (“code evaluates to”) and “=” (“equals”, just as “=” has meant since kindergarten). ↩

⁶ nor any “binding occurrences”: The first x in let x = 5 in (x plus 3); is a binding occurrence, and the second x is a bound occurrence. (We say that “a variable is bound inside the scope of its binding occurrence”.) ↩

⁷ A third issue, pointed out in Programming Languages and Interpretation, is that evaluating deeply nested lets is an O(n²) algorithm. ↩

⁸ Note that the list/map you recur with has the additional binding, but that recursive call shouldn't add/modify the list/map used at the top level. Since java.util.Map is inherently mutable, you'll want to make a new copy of that map before recurring. ↩

⁹You'll have to use the keyword #:mutable when you define-struct that structure, to get the setter. ↩

¹⁰ You can think about which implementation you prefer, when using a language: when seeing the program's internal representation of your code (debugging, when debugging or seeing the compiled ↩

¹¹ Be default, scheme structs are non-mutable; to make them mutable, you have to provide the keyword #:mutable to define-struct. ↩

home—lectures—exams—hws—breeze (snow day)

©2012, Ian Barland, Radford University
Last modified 2012.Dec.06 (Thu) Please mail any suggestions
(incl. typos, broken links)
to ibarlandradford.edu

hw07 Interpreting Nproject

The language N0

hw07
Interpreting N
project