RU beehive logo promo banner for Computing & Info Sciences
CS 380
2024fall
ibarland

environments and closures

Due: Dec.06 (Fri)07 (Sat) 23:59 08 (Sun) 23:59; a 10% bonus if submitted by Dec.07 (Sat) 23:59
Submit: OG6.rkt and OG6-test.rkt (or, OG6-java/*.java) on D2L . Include prolog-code in a block comment near the start of your file, and (perhaps only) the changed code from OG4 (tagged ;>>>OG5 and ;>>>OG6). You may use parts/all of OG4-soln.


Prolog

  1. Prolog lists Write the following Prolog predicates. Do not use append. For full credit, give idiomatic Prolog (no singleton variables, and no = on the right-hand-side).

    1. last(List, Item), which succeeds exactly when Item is the last item in List. This rule should fail if List is empty, of course. (This happens to be the book’s Chpt.16, programming exercise ~6.)
    2. nextToLast(List, Item) which succeeds exactly when Item is the next-to-last item in List. (This rule should fail if List has fewer than two items, of course.)
    3. lastTwoReversed(List, ListOf2) which succeeds exactly when ListOf2 contains the last and the second-to-last item of List (in that order, and nothing else).
    4. reverseLastTwo(List, NewList) succeeds exactly when NewList is exactly the same as List except that the last two items have been reversed. (This rule will fail if either input has fewer than two items.)
    Some examples follow. All of the predicates fail if the first argument is not a list.
    last([1,2,3], 3). true. last([1,2,3], 4). false. last([1,2,3], Y). Y=3. last([], Y). false. last(Y, 3). Y=[3], …
    lastTwoReversed([1,2,3], Y). Y=[3,2]. lastTwoReversed([1], Y). false.
    nextToLast([1,2,3], 2). true. nextToLast([1,2,3], 3). false. nextToLast([1,2,3], Y). Y=2. nextToLast([1], Y). false. nextToLast(Y, 3). Y=[3, _h114], % does not have to be 114, 'course. Y=[_h116, 3, _h114], …
    reverseLastTwo([1,2,3,4], Y). Y=[1,2,4,3]. reverseLastTwo([1,2], Y). Y=[2,1]. reverseLastTwo([1], Y). false.

    Note that the Prolog standard library contains several list functions which you are not to use for this assignment (e.g. append and reverse). Also, for full credit, don’t merely reverse the input list and operate on that result.


OG5: Environments

We continue to build on the OG language implementation from previous homeworks (OG4-soln). You may implement this homework in either Java or Racket (or another language, if you've cleared it with me). Copy your OG0-OG4 file/project to a new OG51. Label each section of lines you change with a comment “;>>>OG5” or “;>>>OG6”. You don't need to turn in any hardcopy of unchanged-code (but do submit a fully-working copy in the drop-box, including all necessary files).

Shortcomings of substitution

There are two problems2 with the substitution approach used in OG2–OG4: (i) we fundamentally can't create recursive functions, and (ii) it’s hopeless should we want to add assigment to our language. Less importantly, you might also have thought it's a bit inefficient (by a factor of two), to do a substitution on an entire sub-tree, and then immediately re-walk through that same subtree then eval it. Can't we do those substitutions while we eval, “just in time”? We solve these problems with deferred substitution: Rather than substituting, we’ll just remember what substitutions we want made, and if we ever encounter an identifier then we look it up in our set-of-deferred-substitutions — our environment. So now we can both evaluate finna rizz y 3 with an environment where y is bound to 7, and also evaluate finna rizz y 3 with an environment where y is bound to 99

  1. OG5 : This problem and the next are really the crux of the entire project.
    Deferred evaluation: OG5 doesn't add any new syntax, but it is a different evaluation model which will give us more expressive semantics.

    Here are just a few cases you might want to test:


A Problem: static vs. dynamic scope

This OG5 algorithm has improved on OG4: we can now at least hope to later handle recursive functions and re-assigning to variables (challenge-credit).

But it’s also worse, because OG5 now fails on a few expressions that OG4 got correct! For example,

iykyk makeAdder irl iso n
                    wya  iso m wya finna rizz n m
wya   ohio ohio makeAdder swoop 3 swoop 4 
; the racket equivalent to the above OG5:
(let {[make-adder (lambda (m)
                     (lambda (n) (+ m n)))]}
  ((make-adder 4) 3))
gives an error unbound identifier: m if no substitution has been done. The problem is that, in OG5, calling ohio make-adder swoop 4 returns a function whose body still includes m and n, but lacks the fact that we’d like it’s m to be bound to 3. One approach might be to have eval return both a value and an environment to use that value with. Upcoming solution: We’ll solve the problem in OG6 with a slightly different approach, storing the environment-to-use inside the function-representation.

More generally, we find that OG5’s eval is now giving us a different notion of binding, known as dynamic scoping:

 iykyk m irl 100
        wya iykyk returnM irl iso x wya m
                        wya iykyk m irl 5
                                  wya ohio returnM swoop 3  

; the racket equivalent of the above OG5:
(let {[m 100]}
   (let {[returnM (lambda(x) m)]}
      (let {[m 5]}
         (returnM 3))))
which now evaluates to 5(!), not 100 as as it did in OG4, and as we want it to again in OG6.

Here's another example (w/ a slightly-less-trivial function):

iykyk m irl 100
       wya iykyk addM irl iso x wya finna rizz x m
                    wya finna   iykyk m irl 5 wya ohio addM swoop 3
                        rizz  iykyk m irl 4 wya ohio addM swoop 3

; The racket equivalent(?) of the above OG5 expr:
(let {[m 100]}
  (let {[addM (lambda (x) (+ x m))]}
    (+ (let {[m 5]} (addM 3))
       (let {[m 4]} (addM 3)))))
evaluates to 15=(3+5)+(3+4), not 206=(3+100)+(3+100). This is a problem with OG5, that will need to redress in OG6, so we evaluate to 206 again.

We say that OG5 is using dynamic scope: the use of a variable (here, m) will refer to its most recent run-time definition! If m is free within a function addM, in dynamic scope you can't (statically) tell where its binding occurence is: are we inside ((let ([m 5]) …)? Or are we inside let ([m 100]) …)?). In general, a function far far way might introduce its own local m, and then call addM; the function addM will use that far-distant, “local” m!5. Dynamic scope has its limited place, but in general it is not how we want most of our variables scoped.

A solution: closures

We want our usual static-scoping, where when you see a variable in the source-code, you can tell where its binding occurrence is. And we want to keep our notion of environments (rather than substitution), so that we can have recursive functions. So what do we do with a function that has m free in its body, and we want that to mean the m that is in the environment when the function is defined, not some possibly-other m that is in the environment when the function is called? We just need the function to remember what environment was being used when the function is defined, and use that environment for when we're evaluating the body! We call that environment the function's closure. This gives the effect you probably expected all along without thinking about it.

Still, beware mutation: Note that even static-scope can give surprising results, if we have one variable shared by multiple closures, and then different functions mutate it. First inspect the code of the following Javascript demo, then run it:


And in Java, the following won't even compile:
for (int i=0; i<10; ++i)
    new Thread( () -> System.out.println("in thread #"+i) ).run(); 
It gives you an error message saying mutation and shared state is bad for your health (I'm paraphrasing).

Upshot: We’ll make OG6, to reclaim static scope, and get what we expect!


OG6: Function + Environment = Closure

A closure is a function plus the environment it was defined in.

One thing to note: Our func-exprs occur statically, at compile time (after calling parse). On the other hand, closures exist only at run-time (when calling eval).

    1. Be sure to backup your working OG5, before proceeding.
    2. Make test-cases for the above problems, with the expected-outcome using static scoping (that is, the same as racket returns). The above two test-cases will suffice,or you can think up more.
    3. Add a structure to represent closures: a function's parameter+body plus the environment it was eval'd in.
      (define-struct closure  ([func-expr? f] [env? e]))
    4. Let's update some type-definitions:
      • Closures are values (and can be returned by eval).
      • But mere func-exprs, while still a type of expr?, are not values any more (since they can't ever be returned by eval).
      • I advise also making closures be a type of expr6. As we update expr, you might want to add stub/dummy branches right now to your cond-statements for functions/templates handling an expr, so you don't accidentally forget about them later. If using Java a switch-expression pattern matching a sealed interface, then you don't need to do this; they compiler will remind you!
    5. To think about: Hmm, when we first parse our expression, we create function-expressions. But (since we're not eval'ing) we don't have any bindings right then. It's only later, if/when we eval-with-env a function-expression, that we will create closure. So functions no longer evaluate to themselves. However, a closure will evaluate to itself.

    6. Updating parse: parse never returns closures, so we won't modify that function at all (which makes sense, since we're not changing the grammar). Remember, closures only exist at run/eval-time, not compile/parse-time.
    7. Updating expr->string:

      Note that closures only exist while a program is running (being eval'd), and expr->string is not usually called while a program runs. So if you want, I guess you could have expr->string just throw (error: 'expr->string "Internal error: I didn't think anybody would want to print a closure."). But for defensive programming, a super easy thing to do is to have a closure just print the same as its enclosed-function does.

      And in fact there actually is a place where we might want to call expr->string from eval: if you encounter a run-time error, and want to print the offending expression as part of the error-message your interpreter creates.

      Having just one check-expects for this is plausible, since it is just calling already-tested code.

      Myself, I went ahead and had closures also print with their environment. Since that environment isn't something I needed to match something in our OG6 grammar, I used racket's built-in string-formatting, which works for all types (incl. structs and lists): (format "the env: ~v" my-env)7 is all I needed to append on.

    8. Updating subst? Don't -- we got rid of that in OG5!
    9. At this point, your code should run again (but fail the two new tests).

    10. Update eval-with-env part I: When eval-with-env encounters a function-definition, instead of just returning it, instead return a closure holding that function and the current-environment.
    11. Update eval-with-env part II: When eval-with-env encounters a closure, it just returns it (since closures are values).
    12. Update eval-with-env part III: Finally update eval-with-env so that when it encounters a function-application. When we eval function-being-called, we're now getting back a closure (not a func-expr). So evaluate the closure's function's body, but using the closure's environment instead of the current-environment (i.e., the environment in effect back when that function was declared) for any free variables in its body. Of course, you also still need to include the parameter's binding, just like you did in OG5.
      Note: Recall how eval'ing a func-apply was a lot like eval'ing a let-expr, except that the pieces we needed were distributed across two structs: the func-apply contained both the application's-argument and function (which in turn contained its param and body). This is reminiscent of that, except where we used to have an immediate func-expr, we now have a closure-structure which contains that func-expr and the environment we need.
      I highly recommend: Sketch a concrete data-structure example, to get this right! Then turn your sketch into a check-expect.

  1. Challenge/Extra Credit

    Everything below here is extra-credit, after completing the above. I do recommend reading it over as food-for-thought, even if you don't plan on implementing any of it.

    Challenge/extra-credit Our scoping-rules for iykyk, which are the same as lisp/scheme/racket's let, don't include the variable's own right-hand-side/initializer to be part of the scope, understandably. But that would preclude8 being able to write recursive functions. In lisp/scheme/racket, the let-variant letrec is the version with the scoping rule to allow defining recursive functions.

    Getting recursive functions to work is a bit tricky: their environment (closure) needs to include its own name! That is, we’ll have eventually end up with a function-struct whose closure-field is a list containing the function-struct. That’s not a problem in racket, no more than it is in Java -- racket struct values are actually references-to-structs, just like in Java9. However, it is a place where you might want to use mutation (read on).

    The tricky bit is that when you're evaling a func-expr you don't yet have its associated name, hmmm. The name only gets established outside the function-definition, in the surrounding OG6 LetExpr being eval’d. So after the let-statement has finished evaling its “right-hand-side” value (which turns out to be a function, including its closure), then you’ll want to further reach in and modify that closure to include one additional ID/value pair. You can, if you like, use the struct-setter (something like “set-func-closure!”); see mutation in racket below.

    The “need” for mutation comes from the cyclical data-dependency: a function-struct contains an environment which refers to that function-struct. Using the shared form, to create cyclical data, removes the need for you to do mutation, although internally it presumably uses mutation. But you can also easily avoid the cyclical dependency, by using a level of indirection: Just keep your function-structure as it was in OG4 (does not contain its closure as a field), but then make a function-with-env structure which has two fields -- the pure function-struct plus the environment to use when calling it; eval will return/use this function-with-env type.

    Further challenge/extra-credit options (of varying difficulty):
    1. Add block-comments to your language; Make sure comments nest. It’s up to you whether or not you store comments in the internal representation, or discard them entirely10. You might want modify your grammar, or consider a multi-phase approach to parsing.
    2. Re-write the code for evaluating let so that it simply transforms it into a function-application, and evaluates that.
    3. Generalize functions to multiple arity, and/or generalize let so that it takes any number of id/value pairs. (Really this would be like scheme’s letrec, since OG6 allows recursion.)
    4. If you want, modify the LetExpr syntax: instead of iykyk id irl Expr wya Expr, you can use Id = Expr ; Expr. (Note that this still represent declaring a new variable, not mutating an existing one.) Your programs now look like procedural programs. Note that parsing is more difficult: after reading an Id, you have to check for whether it’s followed by = or not. You should satisfy yourself that this grammar not ambiguous.)
    5. Add mutation (i.e. assigning to Ids): IdExpr;. See mutation in racket below.
    6. Once we have assignment, add the equivalent of scheme’s begin. (You might use “{” and “}” to delimit such a “block” of statements; you now have implemented all the procedural concepts of a Java or Python interpreter, including first-class functions! And as seen in lecture, you also could write superficial transformations to get most of an object system as well.)

    Mutation in Racket

    If you want to use mutation in your racket-implementation for the extra-credit, use Advanced Student language. This language level includes both: set! (to assign to variables), and set-struct-field! (to assign to struct fields). Note that these two actions are very different, which is why racket gives them separate names; in Java assigning-to-field and assigning-to-local-variable can look identical (syntactically), despite giving rise to very different behavior.

    Since the mutators return (void), and we still want(need) to return a (useful) value from every expression, we will use mutation inside a begin expression:

    (define times-called 0)
    (define (triplify-and-print n)
      (begin (printf "We use `begin` to call a function for a side-effect *and* return a value too.\n")
             (set! times-called (add1 times-called))
             (printf "This function has been called ~a time~a.\n"
                     times-called
                     (if (= times-called 1) "" "s"))
             (* 3 n)))
    
    (triplify-and-print 5)
    (triplify-and-print 2)

    Btw, it’s worth noting that in full-racket (as opposed to advanced-student), begin is implicit in almost all forms (e.g. function-bodies and cond-branches).


1 Presumably you do this for each new homework, but it’s a particularly good idea for this one, since OG5 is not just OG4 plus new code, but instead replaces some of the OG4 code (subst) with a different approach.      
2 A third issue, pointed out in Programming Languages and Interpretation, is that evaluating deeply nested lets is an O(n2) algorithm.      
3 In DrRacket: Click Check Syntax icon: check-mark with magnifying glass and then right-click on the definition of eval, and choose Rename eval.      
4 Note that the list/map you recur with has the additional binding, but that recursive call shouldn't add/modify the list/map used at the top level. If using Java, java.util.Map is inherently mutable, so you’ll want to make a new copy of that Map when recurring.      
5 Early Lisp (1965) included dynamic scope; this was soon recognized as less-than-desirable, and keywords for creating statically-scoped variables were introduced. Similarly the first versions of Clojure’s (2007) var were dynamic with a keyword to make them statically-scoped; this was changed in 2010 to be static-by-default (with a keyword provided to create dynamically-scoped variables). Here’s a (Clojure-oriented) blog post “Perils of Dynamic Scope”.      
6

Hmm, should closures really be considered an expr? I mean, a closure is not something in our grammar. However, it is a type of value, and it's something that may get eval'd or even ->stringd. So it is definitely a hack, to consider a closure as a type of value, and all values as expr?s. So be aware: Our type expr? is no longer exactly corresponding to our grammar's non-terminal <Expr>!

An alternate solution would to make a new type (or/c expr? closure?), and use that as appropriate in our signatures as appropriate.

     
7 format is like printf and ~v is the format-specifier for internal-representation (like python's repr). Or even shorter, the built-in function ~v takes any one value and returns it as a string.      
8 Preclude is a strong word; it turns out it is actually possible to define recursive functions w/o even naming them!! See the lambda calculus      
9 “What, racket uses references-to-structs instead of actual structs? Why didn't you tell us that earlier, Barland?” Because in the absence of mutation, no program’s results can distinguish between having an object vs a reference-to-object-which-is-always-dereferenced. So it was like a religious-opinion difference, until getting mutation in Advanced Student.      
10 You can think about which implementation you prefer, when using a language: when seeing the program’s internal representation of your code (when debugging, or seeing the compiled result of your code).      

logo for creative commons by-attribution license
This page licensed CC-BY 4.0 Ian Barland
Page last generated
Please mail any suggestions
(incl. typos, broken links)
to ibarlandradford.edu
Rendered by Racket.