Over the course of several homeworks,
we'll implement a “tower” of languages (N0, N1, …, N6)
each language incorporating more features than the previous.
N0 is provided for you.
You'll implement N1 in both Racket and Java;
after that you choose which of the two languages you'll use
to complete N2, …, N6.
(10pts)
N1 is just like N0, but with two additional
types of expressions:
Expr ::= … | IsNegExpr
IsNegExpr ::= if Expr isNeg then Expr else Expr ;
BinOp ::= … | mod
|
Update
parse,
toString (a.k.a. expr->string),
and eval
appropriately,
for both the scheme and Java implementations.
Be sure to write test cases first.
The only method which cares what these new expressions mean
(their semantics)
is
eval:
if Expr0 isNeg then Expr1 else Expr2;
first evaluates just Expr0;
if that value is negative, then it evaluates Expr1
and returns its value;
otherwise it evaluates Expr2 and returns its value.
(Note how you are implementing short-circuit semantics for isNegExpr!)
(a mod b) should evaluate to2
a mod b,
where the result is always between 0 (inclusive) and b (exclusive);
in particular, the result is never positive if b<0.
Notice that this is slightly different behavior than
either Java's built-in %
(which behaves differently on negative numbers),
and from Racket's built-in modulo (which only accepts integers).
In Java, you can use
a%b (if b>0 or a%b=0)
and
a%b+b (otherwise).
In scheme or Java, you can use
b*(a/b-floor(a/b))
Note that you are provided sufficient test cases for modExprs,
except that you have to translate it proper N0.
You must make your own test cases for ifNegExprs;
include at least two as-simple-as-possible tests, and two tests with more deeply nested
Exprs.
-
(30pts)
N2 adds identifiers to N1:
Expr ::= … | Id | LetExpr
LetExpr ::= let Id = Expr in Expr;
|
where Id can be
any series of letters and digits which isn't interpretable as a number3.
(Assume for now that any nested let
expressions
use different Ids.
We'll handle that possibility in N3, later.)
Update your three methods
parse,
toString (a.k.a. expr->string),
eval.
We now need to define the semantics of
let Id be (E0) in (E1):
-
Evaluate E0,
calling the result v0.
-
Then, substitute
v0
for
all occurrences
of
Id
inside
E1;
name the result of the substitution
E′.
-
Evaluate E′,
and return that result.
Observe that when evaluating a (legal) N2 program,
eval will never actually encounter an Id --
that Id will have been substituted out before
we ever recur down to it.
The code to make a substitution in an Expr
parse-tree4
is similar to taking an Ancestor-tree,
and replacing every blue-eyed Child with a green-eyed one.
(If you want to write that AncTree function, it may
you understand what's needed for this one.
The only difference
is that an AncTree had only two cond-branches,
while Expr has around seven,
though the code for most of those are very similar).
For example:
let x = 5 in (x plus 3) ⇒ (5 plus 3) ⇒ 85.
Be sure to write test cases for your substitution function before
you write its code;
include several trivial and easy tests,
along with a couple of more complicated nestings
and one deeply nested expression.
-
(20pts)
N3 is just like N2, except we now
allow one variable to shadow another.
Consider an expression
let Id = (E0) in (E1)
where
E1 contains
some non-free (“bound”) occurrences of the same Id?
For example, we might have a let x = …
which in turn contains another let x = …
inside of it.
In that case the inner x should shadow the outer one:
let x = 3 in let x = 5 in (x plus 3);; ⇒
let x = 5 in (x plus 3); ⇒
(5 plus 3) ⇒
8
Thus, when substituting,
only substitute “free occurrences” of an Id
in E1,
not any “bound occurrences”6.
Think through exactly what you need to substitute,
in the following examples:
let y = 3 in let x = 5 in (x plus y);;
⇒
⇒
⇒
8
let y = 3 in let x = y in (x plus y);;
⇒
⇒
⇒
6
let x = 5 in let y = 3 in (let x = y in (x plus y); plus x);;
⇒
⇒
⇒
⇒
⇒
11
(You might find it helpful to draw the parse tree,
rather than staring at the flat string,
especially for this last example.)
Make test cases for
parse (optional),
toString and eval
involving
the N3 equivalents of the following scheme expressions:
(let {[x 5]} (let {[x 3]} x))
(let {[x 5]} (let {[x (+ x 1)]} x))
(let {[x 5]} (let {[y 3]} (+ (let {[x y]} (+ x y))
x)))
(let {[x (let {[x 5}} x)]} (+ x 4))
|
-
(30pts)
N4 adds functions and function-application to our language:
Expr ::= … | FuncExpr | FuncApplyExpr
FuncExpr ::= func(Id, Expr)
FuncApplyExpr ::= <Expr(Expr)> |
We restrict ourselves to unary functions (functions of one argument).
(To think about:
if we wanted to 'fake' functions of 2 arguments in N4,
how could we do it?)
Just like numbers are self-evaluating,
so are FuncExprs.
If evaluating (an internal representation of) a
FuncExpr,
just return that same (internal representation of the) function.
We won't actually evaluate the body until
the function is applied.
In FuncApplyExpr,
the first Expr had better evaluate to a function.
(That is, it might be a FuncExpr,
or an Id which gets substitued to a function value.
It could also be (say) an is0Expr or letExpr
which evaluates to a function.)
For example, here is a function for computing absolute value:
func( x, if x isNeg then (-1 times x) else x;)
|
A funcApplyExpr represents calling a function.
Here are two expressions, both computing the absolute value of -5:
<func( x, if x isNeg then (-1 times x) else x;)(-5)>
let abs = func(x, if x isNeg then (-1 times x) else x;)
in <abs(-5)>;
|
To evaluate a function-application
<Expr0( Expr1)>:
-
Evaluate Expr0; let's call the result f.
(f had better be a function-value!)
-
Evaluate Expr1; let's call the result arg.
-
Substitute f's parameter with arg in f's body;
call this new expression E′.
-
Evaluate E′ and return that value.
Hey, those semantics are practically the same as let!
Observe that our programs can now evaluate to either of
two types: numbers or functions.
In Java, we'll need to have a class which can represent
either,
as the return type for our eval.
You can paste/convert the
N3 java test cases
into the same format as the ones you've been using
at the bottom of N0.rkt.
Make test cases for
parse (optional),
toString and eval
involving making and calling the following functions:
-
A constant function that always returns (say) 17.
-
The absolute value function given above,
-
the function sqr, which squares its input,
and
-
The N4 equivalent of the following scheme definition make-adder:
(define (make-adder n)
(lambda (m) (+ n m)))
; Two examples of applying this function:
;
(make-adder 4) ; evals to (lambda (m) (+ 4 m))
((make-adder 3) 4) ; evals to 7
|
-
(35pts:
This problem and the next are really the crux of the assignment.)
Deferred evaluation:
N5 doesn't add any new syntax,
but it is a different evaluation model which
will give us more expressive semantics.
Use a new file/project for N5, separate from N0-N4.
There are two problems7
with the substitution approach used above:
we can't make recursive functions,
and it doesn't generalize to assigning to variables.
We solve these problems with deferred substitution:
-
When interpreting a program,
eval will take two arguments:
the program to interpret, and a set of bindings.
A binding is just an Id and its associated value;
a set of bindings (an environment)
might be implemented with an
association
list
or a java.util.Map<IdExpr,ValExpr>.
-
In this approach, function-application and let
no longer do any substitution —
instead,
each introduce (exactly one) new binding
and proceed recursively8.
-
Upon encountering an Id, we just look up its value
in our list of bindings.
(Recall that in N2, eval never actually
reached an Id;
in N4 it does.)
-
For full credit,
make sure that you can evaluate recursive functions.
-
You should not need to be doing any parse-tree-substitutions any more
(though you can if you want, to temporarily address the note below).
Note that in N6, no substitutions will be allowed.
A step sideways:
This algorithm as described lets us add recursive functions,
but it also doesn't handle some expressions that N4 did!
For example,
let make-adder = func(m, func(n, (n plus m)) )
in <<make-adder(3)>4>;
gives an error "unbound identifier: m" if no substitution has been done,
however
<func(m, <func(n, (n plus m))(3)>)(4)>
does still work just fine(!).
The problem will be fixed in N6:
in the first example, <make-adder(3)>
returns a function whose body involves m and n,
but not the binding of m to 3.
We'd need return the function and its bindings.
Note that this gives us dynamic scoping (which we'll mention in class):
let m = 100
in let addM = func(x, (x plus m))
in ((let m = 5 in <addM(3)>;)
plus
(let m = 4 in <addM(3)>;))
;
;
|
evaluates to 15, not 206.
-
(35pts) N6: Implement static scope.
-
You'll want to
modify your function-structures so that
they include one extra piece of information:
the bindings in effect when the function was declared.
Be sure to make a careful suite of test-cases,
to test for various ways of capturing bindings in different closures.
(For example, consider the
lecture on ways of using
let* to effectively implement objects, classes, and inheritance.)
-
When evaluating a function-application, use the bindings in effect
back when that function was declared, for its free variables.
-
You should not be doing any substitution.
To think about:
Hmm, when we first parse our expression, we'll
create function-expressions,
but (since we're not eval'ing) we don't have any bindings
right then.
Only later, when we eval a function,
will we actually know about any bindings
(since that call to eval was given a list of bindings)….
clarification:
This means that a function won't quite evaluate to itself anymore —
it'll evaluate to a struct that has the same parameter and body as the
original (parsed) structure,
but a different environment.
-
Note that getting recursive functions to work is a bit tricky,
since their environment (closure) needs to include themselves,
but when you're evaling a func-expr
you don't yet have its associated name, hmmm.
Just after the func-expr is created
then you'll want to go in and change its environment to
include itself.
The easiest way is to do this with mutation
(something like set-func-expr-env!
9).
Be sure to make a copy of your N5 project files before starting N6.
-
Further extra-credit options (of varying difficulty):
-
Add comments to the grammar and parser.
Make sure they nest.
It's up to you whether or not you store comments in the
internal representation, or discard them entirely10.
-
Re-write the code for evaluating let
so that it simply transforms it into a function-application, and evaluates that.
-
Generalize functions to multiple arity,
and/or
generalize let so that it takes any
number of id/value pairs.
(Really this would be like scheme's
letrec,
since N4 allows recursion.)
-
If you want, modify the let syntax:
instead of
let id = Expr in Expr;,
you can use
Id = Expr ; Expr ; .
(Note that this still represent declaring a new variable,
not mutating an existing one.)
Your programs now look like procedural programs.
-
Add mutation (i.e. assigning to Ids):
Id ← Expr;.
(If you want to use mutation in your underlying scheme implementation,
you can use set-first!
and
set-struct-field!11.
You can also think about how you might try implementing mutation in N6
without actually using mutation in your underlying program.)
-
Once we have assignment, add the equivalent of scheme's begin.
(You might use “{” and “}” to delimit
such a “block” of statements;
you now have implemented all the procedural concepts of
a Java or Python interpreter, including first-class functions!
And as seen in lecture,
you also could write
superficial transformations to get most of an object system as well.)
1
This is so we can just use our language's built-in number-parsing functions,
without getting bogged down in tokening input.
So scheme implementations will allow exactly those strings recognized by number?,
(including +nan.0, -inf.0, and 2+3i).
Similarly,
if using Java,
the semantics of N0's arithmetic will be similar
to IEEE floating point arithmetic
(rather than perfectly-correct arithmetic).
↩
2
Because we don't need to check for bad inputs,
it's fine to have your interpreter crash if b=0.
If you prefer to "control" crash — creating a meaningful error message
and calling error or throw yourself —
you are also welcome to do that.
↩
3
Note that our different implementations are now varying by
more than just precision of arithmetic:
in a Java implementation, NaN is a Num,
and in a scheme implementation it's an Id.
We won't use any test cases involving such subtle differences.
However, note how our choices in designing a new language
are being influenced by the language we're trying to easily
implement it in!
This stems from the fact that a primary design constraint on N1 is that
implementing an intepreter for N1 doesn't get bogged down in minutae when using
either Java or Racket.
↩
4
By the way: you will receive no credit
for doing substitution on a string -- all our
code should work on the parse tree itself.
String-substitution (like C pre-processor macros)
can't be generalized to handle shadowed variables (scope)
for N3,
and is in general
fraught with error.
A local-variable construct which requires
globally-unique names isn't very impressive!
↩
5
The notation
“let x = 5 in (x plus 3);
⇒ 5 plus 3 ⇒ 8”
is shorthand for
eval(parse("let x = 5 in (x plus 3);"))
= eval(parse("(5 plus 3)"))
= eval(parse("8"))
|
Observe how we definitely don't write
“"let x = 5 in (x plus 3)" = "(5 plus 3)" = 8”
since the two strings are not .equals(·) to each other,
and strings are never ints.
More specifically:
we distinguish between “⇒”
(“code evaluates to”)
and
“=” (“equals”,
just as “=” has meant since kindergarten).
↩6
nor any “binding occurrences”:
The first x in
let x = 5 in (x plus 3);
is a binding occurrence,
and the second x is a bound occurrence.
(We say that “a variable is bound inside the scope of its
binding occurrence”.)
↩
7
A third issue, pointed out in
Programming Languages and Interpretation,
is that
evaluating deeply nested lets
is an O(n2) algorithm.
↩
8
Note that the list/map you recur with has the additional binding,
but that recursive call shouldn't add/modify
the list/map used at the top level.
Since java.util.Map is inherently mutable,
you'll want to make a new copy of that map before recurring.
↩
9You'll have to use the keyword #:mutable
when you define-struct that structure,
to get the setter.
↩
10
You can think about which implementation you prefer, when
using a language:
when seeing the program's internal representation of your code
(debugging,
when debugging or seeing the compiled
↩
11
Be default, scheme structs are non-mutable;
to make them mutable, you have to provide the keyword #:mutable
to define-struct.
↩