itec380 Expr Trees

Expr Trees
eval/parse OG basics; adding identifiers

Due 2024-Nov-19, at start of class

Note that OG1 is “add code that is extremely similar to what already exists”, but OG2 is not at all rote: it requires a solid understanding of the parse-tree representation and how the functions work.

Over the course of several homeworks, we'll implement a “tower” of languages (OG0, OG1, …, OG6) each language incorporating more features than the previous. OG0 is provided for you.

Download & get the provided OG0 running (both .rkt and .java).

Each time we add new syntax to our language, we must update three methods:

parse!, which turns source-code into an internal representation,
expr->string which turns internal representation back into source-code,
eval, which actually interprets the program, returning a value (a number for OG0-OG2, and a number-or-function in OG3).

I recommend implementing and testing the methods in that order. See the provided test-cases, for examples. There might be other helper functions to implement as well.

Your solution may be in Java or in Racket (or see me, if you want to use a different language). (The exception is OG1, which you must implement in both Racket and Java.)

You do not need to worry about error-checking the programs we process — we will only concern ourselves with syntactically correct OGi programs.
(As a matter of defensive programming, you might still want to add some assertions/sanity-checks in your code.)

Use the D2L discussion board for group questions and answers.

This project is based, in part, on the first chapters of Programming Languages and Interpretation, by Shriram Krishnamurthi. As a result, this homework assignment is covered by the Creative Commons Attribution-NonCommercial-ShareAlike License. Although we're using a different dialect of scheme/racket than that book, you might find it helpful to skim it.

Deliverables:

If adding to an existing file, add a comment “;>>> OG1” or “;>>> OG2” next to the area you are changing; when grading I will search for “>>>”. This helps immensely, thanks.
Since OG2 subsumes OG1, you only need to submit OG1 in one language, and OG2 in the other.
For the D2L dropbox, please submit your files individually (no .zip/.jar). Either:
- OG2{,-test}.rkt, and java files with an OG1 implementation ({Expr,ExprTest,BinOp,IfGE}.java)
- OG1{-test}.rkt, and java files with an OG2 implementation ({Expr,ExprTest,BinOp,IfGE,Id,Let}.java)
(You should use those filenames, except that your new Java classes can be whatever (reasonable) name you chose.)
You do not need to include any provided files that are unchanged:
- student-extras.rkt, scanner.rkt;
- UtilIan.java, Pair.java;
- {IfZero,Num,Value,Paren}.java

For hardcopy, also turn in only the modified/new files.

However, if you want to save printing costs, you are welcome to submit only the parts of the file that changed (including test cases). For example, you might only print snippets like:

;;; added to the main `cond` for `parse!`:
     [(ifGE? e) your code here]   ;>>>OG2


;;; added to OG2-tests.rkt:
(define all-tests {
   ...
   ["vibe check 2 low-key 3  ong 4 bet 5" 5]
   ["vibe check 2 low-key 0  ong 4 bet 5" 4]
   ...
   })

If doing so, be sure your preserve indentation, thanks! Besides, misleading/poor indentation loses points. I actually prefer getting only the changed gists, but I'm certainly not going to require everybody to manually go over their code to copy/paste just-the-changes!

You should not (have your IDE) add any package statements.

Implement OG1 in racket (25pts) and in Java (25pts), both.
OG1 is just like OG0, but with two additions:

<Op>      → … | cancel                                                  Interpretation: “remainder¹”
<Expr>    → … | <IfGE>                                                  Interpretation: “if greater-or-equal”
<IfGE>    → vibe check <Expr> low-key <Expr>  ong <Expr> bet <Expr>  Interpretation: if 1st Expr is >= 2nd, result is the 3rd Expr else the 4th.

Be sure to write test cases first; To ensure everybody makes test cases to cover the basics, I've spelled out these strongly-suggested OG2-initial-tests.html.

Add the cancel operator.
1. update parse! (Java: Expr.parse) to handle this new operator (after writing test cases).
  (Or, if you don't need to update this function, understand why.)
2. update expr->string (Java: Expr.toString) to handle this new operator (after writing test cases).
  (Or, if you don't need to update this function, understand why.)
3. update eval to handle these new operators.
  - The semantics of finna cancel x y is:
    the remainder after dividing x by y, where the result is always between 0 (inclusive) and y (exclusive)²
    In particular, the result should never be positive if y<0. Notice that this is slightly different behavior than either Java's built-in % (which behaves differently on negative numbers), and from Racket's built-in modulo (which only accepts integers). In both racket and Java, you can calculate this as y * (x/y - ⌊x/y⌋), where ⌊r⌋ means the the floor of r.
  Note that you are already provided sufficient test cases for cancels, in the comments of the OG0 test-case files.
Add the ifGE expression.
1. update parse! (Java: Expr.parse) to handle this new type of expression (after writing test cases).
2. update expr->string (Java: Expr.toString) to handle this new type of expression (after writing test cases).
3. update eval to handle this new type of expression. The semantics of vibe check Expr₁ low-key Expr₂ ong Expr₃ bet Expr₄ is:
  first evaluate just Expr₁ and Expr₂ to get values v₁; and v₂; if v₁ is greater-or-equal to (the presumably-low-key) v₂ then evaluate Expr₃ and return its value; otherwise evaluate Expr₄ and return that value.
  For example, vibe check 7.5 low-key 44 ong 3.4 bet 2.1 evaluates to 3.4 2.1.
  (Note how the non-returned Expr never gets evaluated: you are implementing short-circuit semantics for IfGEs!)
  
  You must make your own test cases for IfGEs; include at least two as-simple-as-possible tests, and two tests with more deeply nested Exprs. I suggest including one where the IfGE is not the top-level expression (e.g., a rizz expression which contains a IfGE as one of its operands).

(50pts) Implement OG2 in either racket or Java (your choice).
OG2 adds identifiers to OG1:

<Expr>    → … | <Id> | <LetExpr>                Interpretation: identifier; let
<LetExpr> → iykyk <Id> irl <Expr> wya <Expr>³   Interpretation: bind Id to result of 1st Expr (the right-hand-side); then eval 2nd body Expr w/ that binding

where <Id> can be pretty much any string: it can be any token returned by our scanner. This simplistic ⁴ definition lets us keep our code easy: if we get a token (via peek/UtilIan#hasNextSplittingBy), and it's not the start of any other expression, then it's an identifier! We will assume that (a) an Id is never one of our reserved words like vibe check so you do not need to bother checking for that, and (b) for now any nested LetExpr expressions will use different Ids. We'll handle shadowing in OG3, later.)

Update your three methods parse, toString (a.k.a. expr->string), eval.

add Ids to your data-definition (after deciding what data-type will represent them); then:
1. update expr->string (or, Expr.toString) to handle this new type of expression (after test cases)
2. update parse! (or, Expr.parse) to handle this new type of expression (after test cases)
3. update eval (or, Expr.eval) to handle this new type of expression:
  eval'ing an identifier simply throws an error.
  
  You don't need test cases for evaling Ids. Though if you want to be spiffy, you can use check-error in racket, or assertThrows in JUnit5.
Next, add LetExprs to your data-definition (after deciding what data-type will represent them); then:
1. update expr->string (or, Expr.toString) to handle this new type of expression (after test cases)
2. update parse! (or, Expr.parse) to handle this new type of expression (after test cases). Since our parser splits on punctuation, < will be read as one token, and - as second (both racket and Java).
3. Think about how to update eval to handle this new type of expression. Now we get to the heart of the issue! Write test cases, after reading the rest of this bullet.
  In order to write eval, we need to define the semantics of iykyk Id irl Expr₀ wya Expr₁:
  - Evaluate Expr₀; let's call the result v₀.
  - Then, substitute v₀ for all occurrences of Id inside the tree Expr₁; name the result of the substitution E′.
  - Evaluate E′, and return that result.
  For example: iykyk x irl 3 wya finna rizz x 2 ⇒ finna rizz 3 2 ⇒ 5⁵. Be sure to write test cases for your substitution function before you write its code; include several trivial and easy tests, along with a couple of more complicated nestings as well as one deeply nested expression.
  
  Observe that when evaluating a (legal) OG2 program, eval will never actually encounter an Id — that Id will have been substituted out before we ever recur down to it.
4. Now that we realize that eval will need to do substitution in a tree, and that's a smaller, simpler, self-contained task — perfect for its own helper-function substitute. This function only does substition in a tree, and does not attempt to do any evaluating.
  Go and write substitute (after test cases for it), before implementing eval for LetExprs. (And when starting substitute, start from the template for Exprs.) For the test cases, think about exactly types you'll be wanting to sent to substitute. Your simplest test-cases won't even contain a LetExpr — substitute is a function whose purpose-statement stands on its own, entirely independent of LetExprs and eval!
  Hint: Substituting a variable with a value in an syntax-tree is essentially the same as replacing every occurrence of one name with another in an anc-tree. (The only difference is that an anc-tree had only two cond-branches, while Expr has around five, though the code for most of those are very similar.)
  
  (Note: you must do substitution in the parse tree; no credit given for string-substitution ⁶.)
5. Finally, with the substitute helper written, we're ready: write eval for LetExprs.
  Hint: Your code will correspond almost word-for-word to the semantics given above.

¹ Like remainder, except working for negative and fractional amounts. See OG2.html for details. ↩

² Because we don't need to check for bad inputs, it's fine to have your interpreter crash if y=0. If you prefer to "control" crash — creating a meaningful error message and calling error or throw yourself — you are also welcome to do that. ↩

³ For comparison, here is what comparable constructs look like in other languages:

ML-like:	let x = 2+3 in x*9 end;
lisp-like:	(let {[x (+ 2 3)]} (* x 9))
lisp-like, simplified:	(let x (+ 2 3) (* x 9))
C#-like:	using (var x = 2+3) { return x*9; }
javascript-like:	var x = 2+3; return x*9;
Java-like:	{ int x = 2+3; return x*9; }
Haskell-like:	* x 9 \n where x = + 2 3 \n

Another option for the assignment-character is “:=” (Ada,Pascal), or “←” (indicating which way the data flows), or even something like “Expr → Id { … }” (which might make CS1 students happier — the processing happens left-to-right, just like we read the statement).

Note that you can (and should) test and write a “substitute” function w/o worrying about the exact syntax of a LetExpr. Substituting one thing in a tree for another is its own independent task, de-coupled from eval’ing a local-binding statement.

The relevent xkcd.

↩

⁴ Note that the different scanner(tokenizer) implementations now mean that our definition of the language is vague: in a Java implementation, NaN is a Num, and in a racket implementation it's an Id. We won't use any test cases involving such subtle differences. However, note how our choices in designing a new language are being influenced by the language we're trying to easily implement it in! This stems from the fact that a primary design constraint on OG2 is that implementing an intepreter for OG2 doesn't get bogged down in minutae when using either Java or Racket. ↩

⁵ The notation “iykyk x irl 3 wya finna rizz x 2 ⇒ finna rizz 3 2 ⇒ 5” is shorthand for:

  eval(parse!("iykyk x irl 3 wya finna rizz x 2"))
= eval(parse!("finna rizz 3 2"))
= eval(parse!("5"))

Observe how we definitely don't write “iykyk x irl 3 wya finna rizz x 2 = finna rizz 3 2 = 5” since those strings are not .equals(·) to each other, and strings are never equal to ints. More specifically: we distinguish between “⇒” (“code evaluates to”) and “=” (“equals”, just as “=” has meant since kindergarten). ↩

⁶ All our real code should work on the parse tree itself. String-substitution (like C pre-processor macros) can't be generalized to handle shadowed variables (scope) for OG3, and is in general fraught with error. Even without shadowing: what if a OG2 programmer uses a local variable “x” bound to 5, and then later creates a different local “x” bound to -7? You must be able to handle that situation! A local-variable construct which requires globally-unique names is notably less useful. ↩

This page licensed CC-BY 4.0 Ian Barland
Page last generated

Please mail any suggestions
(incl. typos, broken links)
to ibarlandradford.edu

Expr Trees eval/parse OG basics; adding identifiers

Expr Trees
eval/parse OG basics; adding identifiers