cs380 environments and closures

Due: Mar.03 (Sat) 23:59
Submit: H6.rkt and H6-test.rkt (or H6-java/*.java) on D2L. Include your prolog code in a block comment near the start of your file, and (perhaps only) the code changed from H4 (tagged ;>>>H5 and ;>>>H6). You may use part or all of H4-soln.

Prolog

H5: Environments

We continue to build on the H language implementation from previous homeworks (H4-soln). You may implement this homework in either Java or Racket (or another language, if you've cleared it with me). Copy your H0-H4 file/project to a new H5¹. Label each section of lines you change with a comment “;>>>H5” or “;>>>H6”. You don't need to turn in any hardcopy of unchanged-code (but do submit a fully-working copy in the drop-box, including all necessary files).

Shortcomings of substitution

There are two problems² with the substitution approach used in H2–H4: (i) we fundamentally can't create recursive functions, and (ii) it’s hopeless should we want to add assignment to our language. Less importantly, you might also have thought it's a bit inefficient (by a factor of two) to do a substitution on an entire sub-tree, and then immediately re-walk that same sub-tree to eval it. Can't we do those substitutions while we eval, “just in time”?

We solve these problems with deferred substitution: rather than substituting, we’ll just remember what substitutions we want made, and whenever we encounter an identifier we look it up in our set of deferred substitutions, which we call our environment. So now we can evaluate add y to 3 with an environment where y is bound to 7, and also evaluate the very same add y to 3 with an environment where y is bound to 99.
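To make deferred substitution concrete, here is a minimal Java sketch (Java being one of the languages allowed for this homework). It is not the H4-soln code: the class EnvDemo and the record names Num, Id, Add, and Let are hypothetical stand-ins for whatever your own expression types are called, and it handles only numbers. It shows an eval that takes an environment, looks identifiers up in it, and extends a copy of it at each let.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mini-AST and evaluator; the names are illustrative, not from H4-soln.
class EnvDemo {
    interface Expr {}
    record Num(double value) implements Expr {}
    record Id(String name) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}
    record Let(String name, Expr init, Expr body) implements Expr {}

    // Evaluate e, looking identifiers up in env instead of substituting.
    static double eval(Expr e, Map<String, Double> env) {
        if (e instanceof Num n) return n.value();
        if (e instanceof Id id) {
            Double v = env.get(id.name());
            if (v == null) throw new IllegalStateException("unbound identifier: " + id.name());
            return v;
        }
        if (e instanceof Add a) return eval(a.left(), env) + eval(a.right(), env);
        if (e instanceof Let l) {
            double initVal = eval(l.init(), env);
            // Copy before adding the binding, so the caller's map is left untouched.
            Map<String, Double> extended = new HashMap<>(env);
            extended.put(l.name(), initVal);
            return eval(l.body(), extended);
        }
        throw new IllegalArgumentException("unknown expression: " + e);
    }

    public static void main(String[] args) {
        Expr addYTo3 = new Add(new Id("y"), new Num(3));
        System.out.println(eval(addYTo3, Map.of("y", 7.0)));   // prints 10.0
        System.out.println(eval(addYTo3, Map.of("y", 99.0)));  // prints 102.0
    }
}
```

The same expression object is evaluated twice with no rewriting of the tree; only the environment differs. Copying the map at each let is the simplest way to keep bindings from leaking out of their scope when the map type is mutable.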

A Problem: static vs. dynamic scope

This H5 algorithm has improved on H4: we can now at least hope to later handle recursive functions and re-assigning to variables (challenge-credit).

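But it’s also worse, because H5 now fails on a few expressions that H4 got correct! For example, consider a function addM whose body uses m as a free variable, called from a place where some other m has since been bound.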

More generally, we find that H5’s eval is now giving us a different notion of binding, known as dynamic scoping.

We say that H5 is using dynamic scope: the use of a variable (here, m) refers to its most recent run-time definition! If m is free within a function addM, then under dynamic scope you can't (statically) tell where its binding occurrence is: are we inside (let ([m 5]) …)? Or are we inside (let ([m 100]) …)? In general, a function far, far away might introduce its own local m, and then call addM; the function addM will use that far-distant, “local” m!⁵ Dynamic scope has its limited place, but in general it is not how we want most of our variables scoped.
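The behavior is easy to reproduce in a small interpreter. The Java sketch below is an illustration under assumptions, not the course's H5 code: the class DynamicScopeDemo, the record names, and the helpers extend, asNum, and example are all hypothetical. The program it builds (bind m to 5, define addM, rebind m to 100, then call addM on 1) is made up to match the discussion above. A function call here evaluates the body in the caller's environment, which is exactly what produces dynamic scope.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mini-AST and evaluator; the names are illustrative, not from H4-soln.
class DynamicScopeDemo {
    interface Expr {}
    record Num(double value) implements Expr {}
    record Id(String name) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}
    record Let(String name, Expr init, Expr body) implements Expr {}
    record Func(String param, Expr body) implements Expr {}
    record Call(Expr func, Expr arg) implements Expr {}

    // Return a copy of env with one extra binding; env itself is not modified.
    static Map<String, Expr> extend(Map<String, Expr> env, String name, Expr value) {
        Map<String, Expr> copy = new HashMap<>(env);
        copy.put(name, value);
        return copy;
    }

    static double asNum(Expr v) {
        if (v instanceof Num n) return n.value();
        throw new IllegalStateException("not a number: " + v);
    }

    // Values are Num or Func; a Func simply evaluates to itself.
    static Expr eval(Expr e, Map<String, Expr> env) {
        if (e instanceof Num || e instanceof Func) return e;
        if (e instanceof Id id) {
            Expr v = env.get(id.name());
            if (v == null) throw new IllegalStateException("unbound identifier: " + id.name());
            return v;
        }
        if (e instanceof Add a) {
            return new Num(asNum(eval(a.left(), env)) + asNum(eval(a.right(), env)));
        }
        if (e instanceof Let l) {
            return eval(l.body(), extend(env, l.name(), eval(l.init(), env)));
        }
        if (e instanceof Call c) {
            Expr callee = eval(c.func(), env);
            if (!(callee instanceof Func fn)) throw new IllegalStateException("not a function: " + callee);
            Expr argVal = eval(c.arg(), env);
            // Dynamic scope: the body runs in the CALLER's environment (plus the parameter).
            return eval(fn.body(), extend(env, fn.param(), argVal));
        }
        throw new IllegalArgumentException("unknown expression: " + e);
    }

    // let m = 5 in (let addM = func(x) x + m in (let m = 100 in addM(1)))
    static Expr example() {
        return new Let("m", new Num(5),
                new Let("addM", new Func("x", new Add(new Id("x"), new Id("m"))),
                        new Let("m", new Num(100),
                                new Call(new Id("addM"), new Num(1)))));
    }

    public static void main(String[] args) {
        // Substitution (static scope) would give 6; this evaluator gives 101.
        System.out.println(asNum(eval(example(), new HashMap<>())));  // prints 101.0
    }
}
```

The free m inside addM picks up the binding to 100 that was in force at the call, not the binding to 5 that was in force where addM was written.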

A solution: closures

We want our usual static scoping, where when you see a variable in the source code, you can tell where its binding occurrence is. And we want to keep our notion of environments (rather than substitution), so that we can have recursive functions. So what do we do with a function that has m free in its body, when we want that to mean the m that is in the environment when the function is defined, not some possibly-other m that is in the environment when the function is called? We just need the function to remember what environment was in use when the function was defined, and use that environment when evaluating the body! We call the function bundled with that remembered environment a closure. This gives the effect you probably expected all along without thinking about it.
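One way to picture this in code is the Java sketch below. It rests on the same assumptions as before: the class ClosureDemo, the record names (including Closure), and the helpers extend, asNum, and example are hypothetical, and the test program is invented for illustration. Evaluating a func-expr now produces a Closure that stores the current environment, and a call evaluates the body in that stored environment rather than in the caller's.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mini-AST and evaluator; the names are illustrative, not from H4-soln.
class ClosureDemo {
    interface Expr {}
    record Num(double value) implements Expr {}
    record Id(String name) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}
    record Let(String name, Expr init, Expr body) implements Expr {}
    record Func(String param, Expr body) implements Expr {}
    record Call(Expr func, Expr arg) implements Expr {}
    // A run-time-only value: a func-expr plus the environment it was evaluated in.
    record Closure(Func func, Map<String, Expr> env) implements Expr {}

    // Return a copy of env with one extra binding; env itself is not modified.
    static Map<String, Expr> extend(Map<String, Expr> env, String name, Expr value) {
        Map<String, Expr> copy = new HashMap<>(env);
        copy.put(name, value);
        return copy;
    }

    static double asNum(Expr v) {
        if (v instanceof Num n) return n.value();
        throw new IllegalStateException("not a number: " + v);
    }

    static Expr eval(Expr e, Map<String, Expr> env) {
        if (e instanceof Num || e instanceof Closure) return e;
        // Remember the defining environment alongside the function.
        if (e instanceof Func fe) return new Closure(fe, env);
        if (e instanceof Id id) {
            Expr v = env.get(id.name());
            if (v == null) throw new IllegalStateException("unbound identifier: " + id.name());
            return v;
        }
        if (e instanceof Add a) {
            return new Num(asNum(eval(a.left(), env)) + asNum(eval(a.right(), env)));
        }
        if (e instanceof Let l) {
            return eval(l.body(), extend(env, l.name(), eval(l.init(), env)));
        }
        if (e instanceof Call c) {
            Expr callee = eval(c.func(), env);
            if (!(callee instanceof Closure clo)) throw new IllegalStateException("not a function: " + callee);
            Expr argVal = eval(c.arg(), env);  // the argument is still evaluated in the caller's env
            // Static scope: the body runs in the closure's saved environment (plus the parameter).
            return eval(clo.func().body(), extend(clo.env(), clo.func().param(), argVal));
        }
        throw new IllegalArgumentException("unknown expression: " + e);
    }

    // let m = 5 in (let addM = func(x) x + m in (let m = 100 in addM(1)))
    static Expr example() {
        return new Let("m", new Num(5),
                new Let("addM", new Func("x", new Add(new Id("x"), new Id("m"))),
                        new Let("m", new Num(100),
                                new Call(new Id("addM"), new Num(1)))));
    }

    public static void main(String[] args) {
        System.out.println(asNum(eval(example(), new HashMap<>())));  // prints 6.0
    }
}
```

The later rebinding of m to 100 no longer affects addM, because addM's closure saved the environment in which m was 5. Because extend always copies, the saved environment cannot be changed by anything that happens after the closure is created.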

H6: Function + Environment = Closure

One thing to note: Our func-exprs occur statically, at compile time (after calling parse). On the other hand, closures exist only at run-time (when calling eval).

¹ Presumably you do this for each new homework, but it’s a particularly good idea for this one, since H5 is not just H4 plus new code, but instead replaces some of the H4 code (subst) with a different approach. ↩

² A third issue, pointed out in Programming Languages: Application and Interpretation, is that evaluating deeply nested lets by substitution is an O(n²) algorithm. ↩

³ In DrRacket: Click Check Syntax and then right-click on the definition of eval, and choose Rename eval. ↩

⁴ Note that the list/map you recur with has the additional binding, but that recursive call shouldn't add/modify the list/map used at the top level. If using Java, java.util.Map is inherently mutable, so you’ll want to make a new copy of that Map when recurring. ↩

⁵ Early Lisp (1965) included dynamic scope; this was soon recognized as less-than-desirable, and keywords for creating statically-scoped variables were introduced. Similarly the first versions of Clojure’s (2007) var were dynamic with a keyword to make them statically-scoped; this was changed in 2010 to be static-by-default (with a keyword provided to create dynamically-scoped variables). Here’s a (Clojure-oriented) blog post Perils of Dynamic Scope. ↩

⁶ Hmm, should closures really be considered an expr? After all, a closure is not something in our grammar. However, it is a type of value, and it's something that may get eval'd or even ->string'd. So it is admittedly a hack to consider a closure as a type of value, and all values as expr?s. Be aware: our type expr? no longer corresponds exactly to our grammar's non-terminal <Expr>!

An alternate solution would be to make a new type (or/c expr? closure?), and use that in our signatures as appropriate. ↩

⁷ format is like printf, and ~v is the format specifier for internal representation (like Python's repr). Or even shorter, the built-in function ~v takes any one value and returns it as a string. ↩

⁸ Preclude is a strong word; it turns out it is actually possible to define recursive functions without even naming them!! See the lambda calculus. ↩

⁹ “What, racket uses references-to-structs instead of actual structs? Why didn't you tell us that earlier, Barland?” Because in the absence of mutation, no program’s results can distinguish between having an object vs a reference-to-object-which-is-always-dereferenced. So it was like a religious-opinion difference, until getting mutation in Advanced Student. ↩

¹⁰ You can think about which implementation you prefer, when using a language: when seeing the program’s internal representation of your code (when debugging, or seeing the compiled result of your code). ↩
