ITEC380 U6: environments

U6: environments

Due May.04 (Sat) 14:00~~.02 (Thu)~~
Submit: U6.rkt and U6-test.rkt (or, U6-java.jar) on D2L, and hardcopy at Tuesday's exam, with the changed code from U4 (tagged “;>>>U5” and “;>>>U6”). Include prolog-code in a block comment near the start of your file.

(10pts) Prolog lists

Write the following Prolog predicates. Do not use append. For full credit, give idiomatic Prolog (no singleton variables, and no = on the right-hand-side).

(2pts) last(List, Item), which succeeds exactly when Item is the last item in List. This rule should fail if List is empty, of course. (This happens to be the book’s Chpt.16, programming exercise #6.)
(3pts) nextToLast(List, Item) which succeeds exactly when Item is the next-to-last item in List. (This rule should fail if List has fewer than two items, of course.)
(2pts) lastTwoReversed(List, ListOf2) which succeeds exactly when ListOf2 contains the last and the second-to-last item of List (in that order, and nothing else).
(3pts) reverseLastTwo(List, NewList) succeeds exactly when NewList is exactly the same as List except that the last two items have been reversed. (This rule will fail if either input has fewer than two items.)

All of the predicates fail if the first argument is not a list. Some examples (test cases) are provided, below.

Note that the Prolog standard library contains several list functions which you are NOT to use for this assignment (e.g. append and reverse). Also, for full credit, don’t merely reverse the input list and operate on that result.

We continue to build on the U language implementation from previous homeworks (U4 sol'n posted). You may implement this homework in either Java or Racket (or another language, if you clear it with me). Copy your U0-U4 file/project to a new U5¹. Label each section of lines you change with a comment “;>>>U5” or “;>>>U6”. You don't need to turn in any hardcopy of unchanged-code (but do submit a fully-working copy in the drop-box, including all necessary files).

U5 (25pts; This problem and the next are really the crux of the project.)
Deferred evaluation: U5 doesn't add any new syntax, but it is a different evaluation model which will give us more expressive semantics.

There are two problems ² with the substitution approach used in U2–U4: (i) we fundamentally can't create recursive functions, and (ii) it’s hopeless for ever assigning to variables. Less importantly, you might also think "it's a bit inefficient (by a factor of two), to do a substitution on an entire sub-tree, and then immediately re-walk through that same subtree then eval it. Can't we do those substitutions while we eval, “just in time”? We solve these problems with deferred substitution: Rather than substituting, we’ll just remember what substitutions we want made, and if we ever encounter an identifier then we look it up in our set-of-deferred-substitutions — our environment. So now we might say “evaluate [y add 3] with an environment where y is bound to 7”.

When interpreting a program, we'll use a new function eval-with-env will take two arguments: the program to interpret, and any set of (pre)existing bindings.
A binding is just an Id and its associated value; a set of bindings (an environment) might be implemented by using association list (as seen on exam01) or a java.util.Map<IdExpr,ValExpr>.
To do this, we shall:
1. re-name ³ the function eval as eval-with-env, which takes an environment as a second input.
2. Then, go back and re-create the function eval as a function which still takes just one input, and simply calls eval-with-env, passing it an empty-environment.
3. Be sure you udpate all the recursive calls inside eval-with-env so that they pass a second argument!
This way, all your existing tests to eval can be unchanged, but you can add some unit-tests for eval-with-env to help figure out what that function needs to do with its environment.
search-replace all occurrences of eval in your tests to be instead calling “eval-helper”. Make that a function which simply calls eval with an empty-environment. Then, go change eval to have a second input, and add some two-argument test-cases as just mentioned.
Upon encountering an Id, eval-with-env will just look up its value in our environment (list-of-bindings). (Recall that in U2, eval never actually reached an Id; in U5 it now does.)
In this approach, eval-with-env’ing FuncApplyExprs and LetExprs no longer do any substitution — instead, each introduce (exactly one) new binding and proceed recursively ⁴.

Your test cases should include a recursive U5 function, as well as the addM example below. Here are just a few test cases you might want to test:

What does the U5 program [y add 3] evaluate to, with the environment where x is bound to 5 and y is bound to 7?
What does the U5 program let y get 5 for [y add 3] evaluate to, with the environment where x is bound to 5 and y is bound to 7?
What does the U5 program let y get [x mul 2] for [y add 3] evaluate to, with the environment where x is bound to 5 and y is bound to 7?

Discussing U5

A step sideways: This U5 algorithm has improved on U4: we can now have recursive functions. But it’s also worse, because it now fails on some expressions that U4 got correct! For example,
let make-adder get func m --> (func n --> [n add m])
for app app make-adder wth 3 wth 4
gives an error "unbound identifier: m" if no substitution has been done. The problem is that, in U5, calling call make-adder(3) returns a function whose body still includes m and n, but lacks the fact that we’d like it’s m to be bound to 3. One approach might be to have eval return both a value and an environment to use that value with. We’ll solve the problem in U6 with a slightly different approach, storing the environment-to-use inside the function-representation.

Note that U5’s eval now gives us dynamic scoping (which we’ll mention in class):

let m get 100 
for let addM get func x --> [x add m]
   for [let m get 5 for app addM wth 3
        add
        let m get 4 for app addM wth 3]

evaluates to 15, not 206 as we might expect (and, prefer).

In dynamic scope, the use of a free variable (here, m) will refer to its most recent run-time definition! If m is free within a function addM, you can't tell where its binding occurence is (let m get 5? let m get 100?). In general, a function far far way might introduce its own, local m and call addM; the function addM will use that far-distant, “local” m!⁵.

Note that even static-scope can give surprising results, if we have one variable shared by multiple closures, and then different functions mutate it. In Javascript we can run:
(Do a view-source, to see the Javascript.)
And in Java, the following won't even compile:

for (int i=0; i<10; ++i)
    new Thread( () -> System.out.println("in thread #"+i) ).run(); }

It gives you an error message saying "mutation and shared state is bad for you" (I'm paraphrasing).

Upshot: We’ll make U6, to reclaim static scope, and get what we expect!

(15pts) U6: Implement static scope (closures).

(10pts) Modify your function-structures so that they include one extra piece of information: the environment (bindings) in effect when the function was declared. This is the function’s closure. Update parse and toString.

Also make a careful suite of relevent test-cases involving functions and function-calls to test for various ways of capturing bindings in different closures. (On Thursday’s lecture, we'll see on ways of using let* to effectively implement objects, classes, and inheritance.)
(5pts) When evaluating a function-application, use the environment in effect back when that function was declared, for its free variables.
You should not be doing any substitution.
To think about: Hmm, when we first parse our expression, we’ll create function-expressions, but (since we're not eval'ing) we don't have any bindings right then. So, initially create it with an dummy environment (a sentinel value like #f).

Only later, when we eval-with-env a function, will we actually know about any bindings (since that call to eval-with-env was given a list of bindings)….
subtlety: This means that a function won't quite evaluate to itself anymore — it’ll evaluate to a struct that has the same parameter and body as the original (parsed) structure, but a fully fleshed-out, non-dummy closure.
(challenge/extra-credit: 10pts) Note that getting recursive functions to work is a bit tricky: their environment (closure) needs to include its own name! That is, we’ll have eventually end up with a function-struct whose closure-field is a list containing the function-struct. That’s not a problem in racket, no more than it is in Java -- racket struct values are actually references-to-structs, just like in Java ⁶. However, it ~~will require~~ is a place where you might want to use mutation (read on).

The tricky bit is that when you're evaling a func-expr you don't yet have its associated name, hmmm. So after the let-statement has finished evaling its “right-hand-side” value (which turns out to be a function, including its closure), then you’ll want to further reach in and modify that closure to include one additional ID/value pair. You can, if you like, use the struct-setter (something like “set-func-closure!”); see mutation in racket

The “need” for mutation comes from the cyclical data-dependency: a function-struct contains an environment which refers to that function-struct. Using shared function, to create cyclical data, removes the need for you to do mutation, although internally it presumably uses mutation. But you can also easily avoid the cyclical dependency: Just keep your function-structure as it was in U5 (does not contain its field), but then make a `function-with-env` structure which has two fields -- the pure function-struct plus the environment to use when calling it; eval will return/use this function-with-env type.

You shouldn't need any additional test cases for U6; the tests for U0-U5 should suffice, although one or two U5 examples depending on dynamic binding should now have a new expected-result.

Further extra-credit options (of varying difficulty):

Add comments to your language; Make sure comments nest. It’s up to you whether or not you store comments in the internal representation, or discard them entirely ⁷. You might want modify your grammar, or consider a multi-phase approach to parsing.
Re-write the code for evaluating let so that it simply transforms it into a function-application, and evaluates that.
Generalize functions to multiple arity, and/or generalize let so that it takes any number of id/value pairs. (Really this would be like scheme’s letrec, since U6 allows recursion.)
If you want, modify the LetExpr syntax: instead of let id get Expr for Expr, you can use Id = Expr ; Expr ; . (Note that this still represent declaring a new variable, not mutating an existing one.) Your programs now look like procedural programs. Note that parsing is more difficult: after reading an Id, you have to check for whether it’s followed by = or not. You should satisfy yourself that this grammar not ambiguous.)
Add mutation (i.e. assigning to Ids): Id ← Expr;. See mutation in racket below.
Once we have assignment, add the equivalent of scheme’s begin. (You might use “{” and “}” to delimit such a “block” of statements; you now have implemented all the procedural concepts of a Java or Python interpreter, including first-class functions! And as seen in lecture, you also could write superficial transformations to get most of an object system as well.)

Mutation in Racket

If you want to use mutation in your racket-implementation, use Advanced Student language. This language level includes both: set! (to assign to variables), and set-struct-field! (to assign to struct fields). Note that these two actions are very different, which is why racket gives them separate names; in Java assigning-to-field and assigning-to-local-variable can look identical (syntactically), despite giving rise to very different behavior.

Since the mutators return #void, and we still want(need) to return a (useful) value from every expression, we will use mutation inside a begin expression:

(define times-called 0)
(define (triplify-and-print n)
   (begin (printf "We use `begin` to call a function for a side-effect *and* return a value too.\n")
          (set! times-called (add1 times-called))
          (printf "This function has been called ~a time~a.\n"
                  times-called 
                  (if (= times-called 1) "" "s"))
          (* 3 n)))

(triplify-and-print 5)
(triplify-and-print 2)

Btw, it’s worth noting that in full-racket (as opposed to advanced-student), begin is implicit in almost all forms (e.g. function-bodies and cond-branches).

¹ Presumably you do this for each new homework, but it’s a particularly good idea for this one, since U5 is not just U4 plus new code, but instead replaces some of the U4 code (subst) with a different approach. ↩

² A third issue, pointed out in Programming Languages and Interpretation, is that evaluating deeply nested lets is an O(n) algorithm. ↩

³ In DrRacket: Click Check Syntax (), and then right-click on the definition of eval, and choose Rename eval. ↩

⁴ Note that the list/map you recur with has the additional binding, but that recursive call shouldn't add/modify the list/map used at the top level. If using Java, java.util.Map is inherently mutable, so you’ll want to make a new copy of that Map when recurring. ↩

⁵ Early Lisp (1965) included dynamic scope; this was soon recognized as less-than-desirable, and keywords for creating statically-scoped variables were introduced. Similarly the first versions of Clojure’s (2007) var were dynamic with a keyword to make them statically-scoped; this was changed in 2010 to be static-by-default (with a keyword provided to create dynamically-scoped variables). Here’s a (Clojure-oriented) blog post “Perils of Dynamic Scope”. ↩

⁶ “What, racket uses references-to-structs instead of actual structs? Why didn't you tell us that earlier, Barland?” Because in the absence of mutation, no program’s results can distinguish between having an object vs a reference-to-object-which-is-always-dereferenced. So it was like a religious-opinion difference, until getting mutation in Advanced Student. ↩

⁷ You can think about which implementation you prefer, when using a language: when seeing the program’s internal representation of your code (when debugging, or seeing the compiled result of your code). ↩

This page licensed CC-BY 4.0 Ian Barland
Page last generated

Please mail any suggestions
(incl. typos, broken links)
to ibarlandradford.edu