Why study programming languages?

You can never understand one language until you understand at least two.
—John Searle

The limits of my language are the limits of my world.
—Ludwig Wittgenstein

Those quotes are about natural languages, but they apply equally well to programming languages. Here is a video with specific, real-world examples (14m12s) supporting those quoted claims.

Why study programming languages?

Learn other approaches, which lets you understand your native tongue better.
Practice learning new technologies and ideas.
Think about when a language is helping, and when it is getting in the way; e.g. sorting: scheme vs Java.
Choose a language appropriate to the task.

Languages steer programmers' decisions

Java encourages choosing bad integer arithmetic

2 billion + 2 billion: in Java, python, and racket. By looking at other languages, it’s clear this is by no means an intrinsic limitation on computers.
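To make the comparison concrete, here is a minimal Python sketch. Python's own integers do not overflow, so the helper wrap_to_int32 (a hypothetical name, introduced only for this illustration) imitates what Java's 32-bit int addition does: keep the low 32 bits and read them as two's complement.

```python
def wrap_to_int32(n):
    """Imitate a 32-bit int: keep the low 32 bits, read as two's complement.

    Hypothetical helper for illustration; not part of any library.
    """
    return (n + 2**31) % 2**32 - 2**31

two_billion = 2_000_000_000

exact_sum = two_billion + two_billion      # Python ints grow as needed
java_like_sum = wrap_to_int32(exact_sum)   # what a 32-bit int would hold

print(exact_sum)       # 4000000000
print(java_like_sum)   # -294967296
```

The hardware is the same in both cases; only the language's default choice of representation differs.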
Aside, which languages support: 2_000_000_000? (And: Is this a language feature that is helpful? Might it lead to surprising results sometimes?) My own answers: Yes, and no. Seeing this use of underscores, are there other "corner cases" you want to check?
Things that occur to me:
- 2_00_000, or even just 2_3
- 2_____3
- _23_
for later: We'll skip mentioning this in lecture, and revisit it when we discuss vocab-terms, namely literal.
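One way to probe those corner cases in python (assuming version 3.6 or later, where int() accepts underscores under the same grouping rules as numeric literals in source code) is the sketch below. The helper try_parse is a name made up for this example.

```python
def try_parse(text):
    """Return the int that `text` denotes, or None if int() rejects it."""
    try:
        return int(text)
    except ValueError:
        return None

for numeral in ["2_000_000_000", "2_00_000", "2_3", "2_____3", "_23_"]:
    print(repr(numeral), "->", try_parse(numeral))
```

So the underscores need not mark groups of three, but they must sit singly between digits. (Typed directly into python source, 2_____3 is a syntax error, and _23_ is read as a variable name rather than a number.)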

By the way, observe how we enter the numbers with an underscore, but they don't print out that way. There are three things happening when you type in 234: the input-string is read, and the multiple characters are converted into a single number (more precisely: the internal representation of a number). Then arithmetic is done (if any) to get a result, and here that result needs to be printed, so the language calls toString on that number. The input and output numerals are superficial, and the language works internally with the numbers.

Another example: type in the string 0x77359400 (a numeral), and it's unsurprising that the printed result is not expressed in hex.
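The read / compute / print steps can be separated by hand. This is only a sketch in python of the idea, not how any particular language implements its reader or printer:

```python
# Read: two different strings of characters, one internal number.
from_decimal = int("2_000_000_000")
from_hex = int("0x77359400", 16)
print(from_decimal == from_hex)     # True

# Print: the output numeral is chosen at print time, not remembered from the input.
print(str(from_hex))                # 2000000000
print(hex(from_decimal))            # 0x77359400
print(format(from_decimal, "_"))    # 2_000_000_000
```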
How to add properly in Java? Just use java.math.BigInteger. Which is easier? What does the language encourage you to do, for integer arithmetic?
Why did Java designers choose to make limited-arithmetic the default, and correct-arithmetic difficult?
Does it matter? Ask Boeing about the latest, greatest 787 Dreamliner re-booting. The Y2K problem is a version of integer overflow. Integer overflow can be a security risk, if large, malicious input can overflow a counter (e.g. a 20-year-old bug long present in nearly all decompression software).
Does it matter? Ask the UK about how using an array, instead of a list, caused the gov't to be unaware of thousands of COVID-19 cases:
The extraordinary meltdown was caused by an Excel spreadsheet containing lab results reaching its maximum size, and failing to update. Some 15,841 cases between September 25 and October 2 were not uploaded to the government dashboard.

As well as underestimating the scale of the outbreak in the UK, critically the details were not passed to contact tracers, meaning people exposed to the virus were not tracked down.
…
The problems are believed to have arisen when labs sent in their results using CSV files, which have no limits on size. But PHE then imported the results into Excel, where documents have a limit of just over a million lines.

The technical issue has now been resolved by splitting the Excel files into batches.

Java, python encourage bad rational arithmetic

Consider: 7 ÷ 25 · 25.
Also: 2.5 + x - x + 2.5.
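Here is a short python sketch of both expressions, first with the default floating-point numbers and then with the standard library's fractions.Fraction. The value x = 1e17 is just one choice (an assumption for this example) that is large enough to swallow the 2.5.

```python
from fractions import Fraction

# Floating point: 7/25 gets rounded to the nearest double, and the error survives.
float_result = 7 / 25 * 25
print(float_result == 7)     # False

# Exact rationals: the same expression is exactly 7.
exact_result = Fraction(7) / 25 * 25
print(exact_result == 7)     # True

# With a large x, the first 2.5 is lost entirely.
x = 1e17
print(2.5 + x - x + 2.5)                                             # 2.5
print(Fraction(5, 2) + Fraction(x) - Fraction(x) + Fraction(5, 2))   # 5
```

The exact version is available, but you have to ask for it; the short syntax gives you the approximate one.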

Java, python, racket encourage bad irrational arithmetic

What is √2 · √2 ? (Give the racket expression.)
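For comparison, this is what python's floating-point math.sqrt gives for the same product (the racket expression is still left as the exercise):

```python
import math

product = math.sqrt(2) * math.sqrt(2)
print(product)                    # 2.0000000000000004
print(product == 2)               # False
print(math.isclose(product, 2))   # True: close, but not equal
```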
In sage:

sqrt(2)
n(sqrt(2))
n(sqrt(2), digits=50)
sqrt(2) * sqrt(2)

Note a different language issue, in that third example: we provided the argument “n(…, digits=50)”, rather than just passing 50 as the second argument. That’s “keyword arguments”; why/when might it be helpful? (It turns out python ¹ allows this, and sage is built on python.)
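A small python sketch of the keyword-argument idea: numeric_sqrt is a hypothetical stand-in for sage's n(sqrt(…), digits=…), built on the standard decimal module so that the digits parameter has a real effect.

```python
from decimal import Context, Decimal

def numeric_sqrt(x, digits=15):
    """Hypothetical helper: square root of x to `digits` significant digits."""
    return Decimal(x).sqrt(Context(prec=digits))

print(numeric_sqrt(2))              # uses the default precision
print(numeric_sqrt(2, digits=50))   # the keyword says what the 50 means
print(numeric_sqrt(2, 50))          # same value, but the reader must know the parameter order
```

The keyword form helps most when a function has several optional parameters, or when a bare number at the call site would leave the reader guessing.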

Why the difference?

You've probably guessed that the different behaviors above stem from how the data is represented internally. Java (and many languages) represent primitive ints and floats using a fixed 32 bits (and using 64 bits for long and double). If a number can’t fit in 32 bits, it can’t be represented by that type.
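The "can't fit" part can be checked from python with the standard struct module, which packs a number into a fixed-width field and refuses when it is out of range. The helper name fits_in_int32 is made up for this sketch.

```python
import struct

INT32_MAX = 2**31 - 1   # 2147483647, the same value as Java's Integer.MAX_VALUE

def fits_in_int32(n):
    """True if n can be packed into a signed 32-bit field."""
    try:
        struct.pack(">i", n)
        return True
    except struct.error:
        return False

print(fits_in_int32(2_000_000_000))   # True
print(fits_in_int32(4_000_000_000))   # False

# A signed 64-bit field (Java's long) has room: this packs into 8 bytes.
print(len(struct.pack(">q", 4_000_000_000)))   # 8
```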

For arbitrary-precision integers, you can keep track of a list of digits ². This is what racket and python do by default, and what Java uses java.math.BigInteger for.

aside: Though in some implementations even bignums might have a most-positive-bignum.
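A minimal sketch of the list-of-digits idea, for non-negative numbers only. It is written in python purely for illustration (python's int already does this internally), and the base of 2**32 and the function names are assumptions of this example, not how any particular implementation is laid out.

```python
BASE = 2**32   # each "digit" is a 32-bit quantity

def to_digits(n):
    """Split a non-negative int into base-2**32 digits, least significant first."""
    digits = []
    while n > 0:
        digits.append(n % BASE)
        n //= BASE
    return digits or [0]

def add_digits(a, b):
    """Schoolbook addition of two digit lists, carrying between positions."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        total = carry
        total += a[i] if i < len(a) else 0
        total += b[i] if i < len(b) else 0
        result.append(total % BASE)
        carry = total // BASE
    if carry:
        result.append(carry)
    return result

def from_digits(digits):
    """Rebuild the int that a digit list stands for."""
    return sum(d * BASE**i for i, d in enumerate(digits))

print(add_digits([BASE - 1], [1]))   # [0, 1]: the carry creates a new digit
```

Nothing overflows, because the list simply gets longer; the cost is that every addition is now a loop instead of one hardware instruction.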

In the same vein, representing rational numbers exactly is straightforward, if you use a struct with two fields (the numerator and denominator). Using a fixed-point number to approximate the correct answer is often an insignificant error, but errors can accrue (perhaps causing your missile to kill 28 of your own soldiers)-:.
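The two-field struct might be sketched like this. Rational, times, and divided_by are names invented for this example (python's own fractions.Fraction is the real thing), and the sketch does not guard against a zero denominator.

```python
from dataclasses import dataclass
from math import gcd

@dataclass(frozen=True)
class Rational:
    """Hypothetical two-field struct: numerator over denominator."""
    numerator: int
    denominator: int

    def reduced(self):
        """Divide out the common factor and keep the denominator positive."""
        divisor = gcd(self.numerator, self.denominator)
        sign = -1 if self.denominator < 0 else 1
        return Rational(sign * self.numerator // divisor,
                        sign * self.denominator // divisor)

    def times(self, other):
        return Rational(self.numerator * other.numerator,
                        self.denominator * other.denominator).reduced()

    def divided_by(self, other):
        return self.times(Rational(other.denominator, other.numerator))

seven = Rational(7, 1)
twenty_five = Rational(25, 1)
print(seven.divided_by(twenty_five).times(twenty_five))
# Rational(numerator=7, denominator=1)
```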

(In dynamically-typed languages: the machine representation might include a tag, which requires a few bits. For example, a 32-bit quantity starting with the bits 101 might happen to let the language know that this quantity is the start of a string. For common types, a shorter tag might be used; since int is a very common type, a single leading bit of 0 might indicate a small-integer (one that fits in the remaining bits), and that lets the language-implementation use existing integer-arithmetic-hardware for that value!)
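A toy model of that tagging scheme, with 32-bit words modelled as python ints. The exact layout (tag 101 in the top three bits, 29 bits of address, non-negative small integers only) is an assumption made for this sketch; real implementations choose their own layouts.

```python
WORD_BITS = 32
STRING_TAG = 0b101   # the three-bit tag from the example above

def make_small_int(n):
    """Small non-negative ints are stored as-is: leading bit 0, 31 value bits."""
    assert 0 <= n < 2**31, "does not fit; a real implementation would switch to a bignum"
    return n

def make_string_ref(address):
    """Tag 101 in the top three bits, an (assumed) address in the other 29."""
    assert 0 <= address < 2**29
    return (STRING_TAG << 29) | address

def is_small_int(word):
    return word >> (WORD_BITS - 1) == 0

def is_string_ref(word):
    return word >> (WORD_BITS - 3) == STRING_TAG

a, b = make_small_int(22), make_small_int(17)
s = make_string_ref(0x1234)

# Two small-int words can be added with ordinary integer addition;
# the result is still a valid small-int word as long as the leading bit stays 0.
print(a + b, is_small_int(a + b))         # 39 True
print(is_string_ref(s), is_small_int(s))  # True False
```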

video (20m02s)

Java encourages you to use arrays (even when lists are better)

Suppose you want to keep a list of temperatures on recent days (in Celsius, of course).

List<Integer> nums = new ArrayList<Integer>();
nums.add(22);
nums.add(17);
nums.add(24);
nums.add(30);
nums.add(5);

// n is the temperature being searched for (declared elsewhere)
if (nums.contains(n)) {
  ...
}

Oh my goodness!

What if you want to use an array? Java makes that much easier for you:

Integer[] nums2 = {22, 17, 24, 30, 5};

So if you want to sort a bunch of items, it’s concise to use an array, and verbose to use a List — Java is encouraging the programmer to use an array, even if a List is more appropriate.

It’s worth mentioning that although arrays are easy to declare, iterate over, and statically initialize in Java, they're also a bit underdone: there are many useful functions for arrays that Java didn’t include (not even when they added the utility class java.util.Arrays); a contains method is one of them (!). You can download a third-party library like guava or apache-commons (written since others were fed up with this shortcoming), or you can often convert the array to a list, and then use the standard List and java.util.Collections methods:

Integer[] nums2 = {22, 17, 24, 30, 5};
List<Integer> nums1 = java.util.Arrays.asList( nums2 );

Recent versions of Java allow even more succinct phrasings, which finally rival the conciseness of the array syntax using mere method calls:

List<Integer> nums1 = java.util.Arrays.asList( new Integer[] {22, 17, 24, 30, 5} ); // creates a *fixed-size* list: set works, but add/remove throw

// In Java 10 (List.of arrived in Java 9, var in Java 10):
var nums1 = java.util.List.of( 22, 17, 24, 30, 5 );  // creates an *immutable* list!

Warning:
Java's asList method is documented as “Returns a fixed-size list...”. If you take its result and call (say) the .add method, it throws an UnsupportedOperationException — This is unintuitive behavior! When a method says it returns a List, the standard expectation is that it returns an object which supports the methods of interface List — that's what (abstract) types are all about! A developer shouldn't have to memorize which of the umpteen java.util.List methods create Lists that don't actually support all the List methods.

If they had named the method asFixedSizeList, it would make sense. Or if they just had the method return an actual List (and copy the backing array), it would make sense. They decided to favor both runtime performance and short names, at the expense of violating both the Principle of Least Surprise and the good-O.O. practice of avoiding UnsupportedOperationExceptions. (Also, the designers might have made an interface ImmutableList, to avoid a plethora of UnsupportedOperationExceptions.)

Lisp encourages you to use lists (even where arrays may be better)

While Java privileges strings and arrays with particularly simple syntax and rules, Lisp privileges lists (in fact, its name stems from “List Processing”). Instead of constructing (list 320 'ibarland 2018 'fall), one can write '(320 ibarland 2018 fall). This gets more convenient when you need lists-of-lists: instead of

(list (list 320 'ibarland 2018 'fall)
      (list 320 'nokie 2018 'fall)
      (list 380 'ibarland 2017 'fall)
      (list 220 'jchase 2018 'spring))

you can use quote (') to write

'((320 ibarland 2018 fall)
  (320 nokie 2018 fall)
  (380 ibarland 2017 fall)
  (220 jchase 2018 spring))

(And we won't even think about how to represent this data in idiomatic Java; it involves new classes, enumerated types, and constructor calls, in addition to the list creation itself.)

Furthermore: languages in the Lisp family have, in addition to quote, a back-quote (`) which lets you do even more!

Note that if using arrays (“vectors” in Lisp), there is a “long” constructor (using (vector 4 "hi" 5), analogous to (list 4 "hi" 5)), but there is also a shorthand (using #(4 "hi" 5), analogous to '(4 "hi" 5)). BUT: there are so many built-in functions for handling lists, and so many libraries which use lists as arguments and/or return types, that in a Lisp language I am sorely tempted to use lists even if I acknowledge that vectors are a more appropriate data type ³. Arguably the push to use this approach is not coming from the language per se, but rather from its 3rd-party libraries. However, that is still a real force at work in real languages.

python

Special mention goes to python, for its convenient and mnemonic shorthand syntax for some particularly common (and useful) data types:

lists (using square brackets: [4, 'hi', 5]),
tuples (using parentheses: (4, 'hi', 5)); think of a tuple as an immutable list of fixed size, useful for keeping track of a professor's name, office, and phone number without making a whole new class just for that,
sets (using curly braces: {4, 'hi', 5}), and
dictionaries (using curly braces and colons: { 4: 'quatro', 'hi': 'ola', 5: 'cinco' }).

And most of the most-useful standard library functions are overloaded to accept any of these data types. This combination is what makes pythonistas so ardent.
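A quick sketch of that combination, using the same sample values as above: the four literals, and a few built-ins (len, in, iteration via type-agnostic for) that work unchanged on all of them.

```python
as_list = [4, 'hi', 5]
as_tuple = (4, 'hi', 5)
as_set = {4, 'hi', 5}
as_dict = {4: 'quatro', 'hi': 'ola', 5: 'cinco'}

for collection in (as_list, as_tuple, as_set, as_dict):
    # len and `in` behave the same way on all four types
    # (for a dictionary, `in` looks at the keys).
    print(type(collection).__name__, len(collection), 'hi' in collection)
```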

And a reminder: by seeing what other languages make easy or difficult, you gain the ability to see what your “primary” language is making easy or difficult. “You can never understand one language until you understand at least two.”

¹ Keyword arguments are allowed in racket, but they aren’t automatic like they are in python. On the other hand, keywords are values, so you can pass/return keywords from functions.

² Where each individual “digit” is perhaps a 32-bit quantity, or so.

³ Sure, there are functions which will convert a list to a vector and vice versa. However, if I need to constantly convert back and forth, I still have the temptation to just stick with using lists throughout.

This page licensed CC-BY 4.0 Ian Barland