immutable data
pros and cons
In the course of writing our videogame, we (hopefully) learned:
- The template for processing lists.
- Model-view-controller: a very obvious distinction.
The View is all of our draw-stuff functions;
the Controller is our event-handling functions (stuff-handle-key, and the "tick-handlers"
move-stuff/update-stuff).
The Model is the actual structs,
along with functions like frog-overlap-truck?
and game-over?.
- How to write a videogame with immutable data: never reassigning to a single variable or field!
In particular, this approach is called “world passing style”:
we have a (say) list of trucks, give it to a function, and get back
a whole new list of trucks.
Or we have a world struct, which big-bang passes to various
event-handler functions, getting back the new/updated world struct.
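Since the course code is in Racket, here is a rough Java analogue of world-passing style. It is a minimal sketch under stated assumptions: the names `Truck`, `World`, `moveTruck`, and `tick` are hypothetical stand-ins (not the course's actual definitions), and it assumes Java 16 or later for records.

```java
import java.util.List;

// Hypothetical names throughout: a sketch of world-passing style, not the course's Racket code.
class WorldPassingSketch {
    // Immutable truck: x position and speed (pixels per tick).
    record Truck(int x, int speed) {}

    // Immutable world: the frog's x position plus all the trucks.
    record World(int frogX, List<Truck> trucks) {}

    // Returns a NEW truck; the argument is left unchanged.
    static Truck moveTruck(Truck t) {
        return new Truck(t.x() + t.speed(), t.speed());
    }

    // Tick handler: takes the current world and returns the next one.
    static World tick(World w) {
        List<Truck> moved = w.trucks().stream()
                .map(WorldPassingSketch::moveTruck)
                .toList();
        return new World(w.frogX(), moved);
    }

    public static void main(String[] args) {
        World w0 = new World(50, List.of(new Truck(0, 5), new Truck(100, -3)));
        World w1 = tick(w0);
        System.out.println(w0); // the old world is still intact
        System.out.println(w1); // the new world has every truck moved
    }
}
```

Nothing is ever reassigned: each handler receives a world and hands back a fresh one, which is roughly the contract big-bang expects of its handlers.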
Immutable data: disadvantages
Compare move-truck with an imperative version that just re-assigns to a field.
This requires much more memory (allocate a whole new truck),
and more time (we have to copy all the fields over, rather than just assign to one field).
For a struct with 20 fields, this is a slowdown of perhaps 20x, plus more time for memory allocation and garbage collection.
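To make the comparison concrete, here is a hedged Java sketch of the two styles side by side. The class names and fields are assumptions for illustration (a real truck struct would have more fields), and records assume Java 16 or later.

```java
// Hypothetical truck types, used only to contrast the two update styles.
class MoveTruckComparison {
    // Imperative version: moving overwrites one field in place.
    static final class MutableTruck {
        int x;
        final int speed;

        MutableTruck(int x, int speed) {
            this.x = x;
            this.speed = speed;
        }

        void move() {
            x += speed; // a single assignment, no allocation
        }
    }

    // Immutable version: moving allocates a new truck and copies every field.
    record ImmutableTruck(int x, int speed) {
        ImmutableTruck move() {
            return new ImmutableTruck(x + speed, speed);
        }
    }

    public static void main(String[] args) {
        MutableTruck m = new MutableTruck(10, 5);
        m.move();
        System.out.println(m.x); // the old position is gone

        ImmutableTruck a = new ImmutableTruck(10, 5);
        ImmutableTruck b = a.move();
        System.out.println(a + " -> " + b); // both versions still exist
    }
}
```

With only two fields the copying cost is small; it grows with the number of fields, which is the cost the paragraph above describes.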
Certainly, this seems expensive, but we can also make the following observations:
- People often mis-estimate where delays come from:
programs almost always keep up with my keystrokes more than fine;
it’s web-connections that really slow things down to the degree that I notice a delay.
- In practice, our video-game ran just fine: a few hundred trucks/aliens/bricks/etc.
don’t come close to taxing a modern system.
You can take your video-game assignment and scale up the number of
objects, and see how it plays.
(Any lagginess in the assignment usually comes from only moving 1 pixel per keypress,
not actual CPU run-time!
That said, the image-library is not designed for time-efficiency.)
In real life, rather than fear or suspect that an approach won’t scale,
I encourage you to implement something both ways
and actually measure the difference.
The slower approach might end up being too slow, but then again maybe it’s not a
bottleneck,
and you have saved dozens of development & debugging hours by avoiding useless optimization.
(And even if not useless: how many times must your program run milliseconds faster,
just to recoup the extra hours and hours of development time and cost?)
- Measuring the effects is better than just imagining performance differences:
- A rule-of-thumb is that (imperative) Python programs tend to be a factor of 5 slower than
a C program, and (mostly-functional) Racket programs are a factor of 2-3x slower than a
Python program.
- A confounding factor: dynamically-typed languages require more run-time checks, which slows performance.
- See some actual benchmarks.
- Haskell, a statically-typed pure-functional language, often approaches C
in efficiency. But there do exist some problems where it doesn’t (can’t?).
- Again, most programs you write are not computation-bound these days,
so efficiency trade-offs are not as important as people tend to think.
When measuring tradeoffs, ignore differences less than 100ms for typical-size inputs!
In my experience,
if I notice a program is running slowly, it is almost always a network-delay, not a CPU issue.
- Immutable data can mean better memory-sharing, w/o worrying about aliasing.
This might (slightly) ameliorate the increased memory overhead.
To do in class:
Draw memory-diagram of sharing in lists, and trees.
- How much extra memory is needed?
At most a factor of two, in the world-passing style.
When passing in (say) a list-of-trucks and getting a new “modified” list back,
it’s likely that the old list is now unreachable, and can be garbage-collected.
You can imagine that there is enough memory for two copies of the list, and the
program alternates which memory chunk holds the current version,
and which chunk is being used to write the updated version into.
- With mutable data, if you need two distant functions to convey information to each other, they can do so
with global variables.
(This is a strength, and a weakness.)
See the next point.
- To be purely functional,
you tend to need more inputs to functions,
and want/need to return multiple values.
This can be tedious, especially if the language doesn’t provide support.
Though this is also precisely why code that relies on global state is harder to reason about: it’s unclear
what changes in distant code might change this procedure’s behavior.
If you hear people talk about “dependency injection”, they mean
“pass in extra arguments (like database-connections), rather than using global state”;
people feel this is a win even in imperative programming,
which suggests this negative is not so big after all.
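The last point above (extra inputs and multiple return values versus global state) can be sketched in Java as follows. Everything here is an assumption made up for illustration: `speedBonus`, `Step`, and `nextStep` are hypothetical names, and records assume Java 16 or later.

```java
// Hypothetical example contrasting hidden global state with explicit inputs and outputs.
class ExplicitInputsSketch {
    // Global mutable state: any distant code can reassign this.
    static int speedBonus = 0;

    // The result depends on a variable that does not appear in the signature.
    static int nextXUsingGlobal(int x, int speed) {
        return x + speed + speedBonus;
    }

    // A small record stands in for "returning multiple values".
    record Step(int x, boolean offScreen) {}

    // Everything that affects the result is a parameter, and
    // everything the caller needs back is in the returned record.
    static Step nextStep(int x, int speed, int speedBonus, int screenWidth) {
        int newX = x + speed + speedBonus;
        return new Step(newX, newX < 0 || newX > screenWidth);
    }

    public static void main(String[] args) {
        System.out.println(nextXUsingGlobal(10, 5)); // answer depends on who last set speedBonus
        System.out.println(nextStep(10, 5, 2, 100)); // answer depends only on the arguments
    }
}
```

The explicit version is wordier at every call site, but a reader can tell from the signature alone what can change its behavior.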
Immutable data: advantages
Secure Programming
Mutability can become a security hole.
(We'll assume we're using Java, for the moment.)
Imagine the following:
In an operating system, you certainly want to know who all your users are —
String[] usernames = { ... };.
And of course you wouldn't make this a public-field/variable, because you don't want other code
to assign to it!
But you may want to let other people know all the users, so it might be plausible
that you'd include a getter, String[] getUsernames() { return usernames; }.
What's the problem? Aliasing + Mutability!
An attacker can go ahead:
String[] theUsers = getUsernames();
theUsers[2] = "hax0r";
At this point, the OS now thinks that hax0r is a known, registered user. Oops!
Secure Programming principle:
Don't give references to your own data-structures, to untrusted code!
Instead, make a copy.
So we'll change our function to be
String[] getUsernames() { return Arrays.copyOf(usernames, usernames.length); }.
Now an attacker can get a copy of the array, and if they assign into the array
that's fine, they're only assigning to their own copy.
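Here is a small, self-contained sketch of both getters, so the difference can be run and observed. The class name `UserRegistrySketch`, the sample usernames, and the helper `isRegistered` are hypothetical, added only for illustration.

```java
import java.util.Arrays;

// Hypothetical stand-in for the operating system's user table.
class UserRegistrySketch {
    private final String[] usernames = { "alice", "bob", "carol" };

    // Leaky getter: hands out a reference to the internal array itself.
    String[] getUsernamesLeaky() {
        return usernames;
    }

    // Defensive getter: hands out a fresh array holding the same strings.
    String[] getUsernames() {
        return Arrays.copyOf(usernames, usernames.length);
    }

    boolean isRegistered(String name) {
        return Arrays.asList(usernames).contains(name);
    }

    public static void main(String[] args) {
        UserRegistrySketch os = new UserRegistrySketch();

        String[] copy = os.getUsernames();
        copy[2] = "hax0r"; // only the attacker's copy changes
        System.out.println(os.isRegistered("hax0r"));

        String[] alias = os.getUsernamesLeaky();
        alias[2] = "hax0r"; // this writes into the OS's own array
        System.out.println(os.isRegistered("hax0r"));
    }
}
```

Making the field `private final` does not help with the leaky getter: `final` protects the reference, not the array's contents.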
But our problems may not be over yet!
If we are in a language like C (or many others), strings are mutable.
So an attacker can still weasel their way into the system's trust:
String[] theUsers = getUsernames();
theUsers[2][0] = 'h';
theUsers[2][1] = 'a';
theUsers[2][2] = 'x';
theUsers[2][3] = '0';
theUsers[2][4] = 'r';
theUsers[2][5] = '\0';
Secure Programming principle, cont.:
Remember to make a deep copy!
Note that Java is not susceptible to this latter attack. Why not?
Java's Strings are immutable!
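Because Java's String is immutable, the second attack cannot be reproduced with String[] directly. The sketch below substitutes StringBuilder as a stand-in for a mutable string type; that substitution, the class name, and the sample usernames are all assumptions for illustration.

```java
// Hypothetical registry whose elements are mutable, to show shallow vs. deep copying.
class DeepCopySketch {
    private final StringBuilder[] usernames = {
        new StringBuilder("alice"), new StringBuilder("bob")
    };

    // Shallow copy: a new array, but it still points at the same StringBuilder objects.
    StringBuilder[] getUsernamesShallow() {
        return usernames.clone();
    }

    // Deep copy: a new array AND a new StringBuilder for every element.
    StringBuilder[] getUsernamesDeep() {
        StringBuilder[] copy = new StringBuilder[usernames.length];
        for (int i = 0; i < usernames.length; i++) {
            copy[i] = new StringBuilder(usernames[i]);
        }
        return copy;
    }

    boolean isRegistered(String name) {
        for (StringBuilder u : usernames) {
            if (u.toString().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        DeepCopySketch os = new DeepCopySketch();

        StringBuilder[] deep = os.getUsernamesDeep();
        deep[1].setLength(0);
        deep[1].append("hax0r"); // edits only the attacker's private copy
        System.out.println(os.isRegistered("hax0r"));

        StringBuilder[] shallow = os.getUsernamesShallow();
        shallow[1].setLength(0);
        shallow[1].append("hax0r"); // edits the object the OS also holds
        System.out.println(os.isRegistered("hax0r"));
    }
}
```

With immutable elements the shallow copy would already be enough, which is why the String[] version of the getter is safe in Java.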
 This page licensed CC-BY 4.0 Ian Barland Page last generated | Please mail any suggestions (incl. typos, broken links) to ibarland radford.edu |