Programming languages are sometimes categorized as either expression-oriented or statement-oriented. Statements typically do something, whereas expressions produce values.

These two categories are not clear-cut, and languages can and do support both styles but tend to lean one way or the other. As a rule of thumb we can understand the difference, and categorize the language, by taking a look at how one writes conditionals (a.k.a. if statements) in the language.

Statement oriented if

A very common form of the if statement that you know and love enables you to choose which of two actions to perform:

# This is Python code
if shouldGenerate:
  print("yes")
else:
  print("no")

This is of course very useful, but if we think about it, in this particular snippet we repeat the call to print, which in a way is a violation of the DRY principle. So in some cases, especially if this call was more complicated than a single parameter, you’d prefer to instead calculate the parameters once, and then call print once:

if shouldGenerate:
  importantStuff = generateStuff(42)
else:
  importantStuff = loadStuffFromFile("some.stuff")
printStuff(importantStuff, otherParams, Formatting.INDENTED)

But now, we instead introduced a different kind of duplication (triplication here even, and quadruplication if you’d need to declare the variable first like in TypeScript) when we need to give the intermediate value a name - and as the joke goes, “the two hardest problems in programming are naming things, cache invalidation and off-by-one errors.”

Expression oriented if

If we could treat the if as an expression instead, then we could drop one repetition of the name.

importantStuff = if shouldGenerate:
  generateStuff(42)
else:
  loadStuffFromFile("some.stuff")
printStuff(importantStuff, otherParams, Formatting.INDENTED)

Note that the above is in fact not valid Python. However there’s an expression variant of if that looks slightly different;

print("yes" if shouldGenerate else "no")

which enables us to write valid Python where we can assign the value of the if-expression to a variable for later use;

importantStuff = generateStuff(42) if shouldGenerate \
else loadStuffFromFile("some.stuff")
printStuff(importantStuff, otherParams, Formatting.INDENTED)

Note as an aside that Python’s syntax does not allow us to spread the if expression variant over multiple lines without either a backslash or a pair of enclosing parentheses, which hints that the language designers didn’t want to encourage writing overly complicated expressions.
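For completeness, a pair of parentheses is another way to let the conditional expression span several lines, since Python ignores line breaks inside brackets. A minimal sketch, where generate_stuff and load_stuff_from_file are hypothetical stand-ins for the functions used in this post:

```python
# Hypothetical stand-ins for the functions used in the examples above.
def generate_stuff(seed):
    return f"generated({seed})"

def load_stuff_from_file(path):
    return f"loaded({path})"

should_generate = True

# Parentheses let the conditional expression span lines without a backslash.
important_stuff = (
    generate_stuff(42)
    if should_generate
    else load_stuff_from_file("some.stuff")
)
print(important_stuff)  # generated(42)
```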

If we switch to Scala 3 - an expression oriented language - we don’t even have both a statement form and expression form of if, just the expression form.

// This is Scala 3 code
val importantStuff =
  if shouldGenerate then
    generateStuff(42)
  else
    loadStuffFromFile("some.stuff")
printStuff(importantStuff, otherParams, Formatting.Indented)

We can even avoid naming the conditional expression entirely by inlining it directly at parameter position;

printStuff(
  if shouldGenerate then generateStuff(42)
  else loadStuffFromFile("some.stuff"),
  otherParams,
  Formatting.Indented
)

In fact, you commonly encounter much more complex nested expressions in Scala, which is kind of a readability problem.

It turns out that naming intermediate values can in fact make things easier to understand! It is easy to go overboard with expressions in languages like Scala.
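To make the trade-off concrete, here is a small Python sketch of the same decision written first as one nested conditional expression and then with named intermediate values. The shipping rules and all names are made up for illustration only.

```python
def shipping_cost_nested(weight_kg, express, member):
    # Everything in one expression: compact, but the reader must untangle it.
    return (0 if member and not express
            else (15 if weight_kg > 10 else 8) * (2 if express else 1))

def shipping_cost_named(weight_kg, express, member):
    # Same logic, with the intermediate values given names.
    if member and not express:
        return 0
    base_rate = 15 if weight_kg > 10 else 8
    express_factor = 2 if express else 1
    return base_rate * express_factor

print(shipping_cost_nested(12, True, False))  # 30
print(shipping_cost_named(12, True, False))   # 30
```

Both functions compute the same result; which one reads better is largely a matter of how much nesting the reader is willing to follow.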

The value of nothing

But how, you might ask, do languages (like Scala) with only expression oriented conditionals cope with the lack of the statement form, and the need to choose which of two actions to perform?

There are two parts to the answer. The first is that you are typically not forced to use the result of an expression in expression-oriented languages, so it is perfectly valid in Scala to just call println (which doesn’t return anything useful anyway) in each arm of the conditional;

if shouldGenerate then
  println("yes")
else
  println("no")

The second part of the answer is uniformity! To understand why, let’s look deeper. As we shall see, we can make some useful generalizations of our code if we are not forced to treat functions differently depending on if they do or do not return something.

Uniformity

In Scala and many other functional languages, there is commonly (and somewhat paradoxically) a special value (and a corresponding data type) representing ’no useful value’ which is used (sometimes implicitly) as the return value when there’s nothing else useful to return.

So in effect, there’s no such thing in Scala as a function that does not return a value, whereas in many languages (C, C++, Java, C# and Pascal come to mind), functions that do return values are different from functions that don’t: they are explicitly declared as function or procedure in Pascal, or declared with a void return type in C, C++, Java or C#.

In Scala, this ’no useful value’ is of a type called Unit, which only has a single possible value, written as (), which is used when there’s no other useful value to return.

I’ve used Scala for years, and used Unit all the time, but can’t remember needing to explicitly write out the () more than a handful of times.

In Python, the corresponding value is None, which is implicitly returned from any function that finishes without an explicit return. Scala similarly supplies () implicitly for a function whose result type is Unit.
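A quick Python sketch of this behaviour: a function without a return statement still produces a value, namely None, and that value can be stored and passed around like any other. The function name log_message is hypothetical.

```python
def log_message(text):
    # No return statement: Python returns None implicitly.
    print(f"[log] {text}")

result = log_message("hello")
print(result is None)  # True

# The 'no useful value' can be collected like any other value.
results = [log_message("a"), log_message("b")]
print(results)  # [None, None]
```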

But that’s no different from using void, is it?

It is in fact very different. Let’s look at an example:

Assume we have a function PerformHugeCalc that performs some calculation, producing a very large result needing a lot of memory. Instead of returning the result to you (let’s say, because it would need to be sent to you over a network), we want to write a generic function CondenseHugeCalc that takes a “condenser” function, which “condenses” the value in some useful manner to something smaller. CondenseHugeCalc should then return this much smaller representation as its result. Possibly the condenser could write the huge result to a file, and then just return the file name.

In C# we’d write something like this, assuming for now that the condenser returns a string (i.e. the type of the condensed value is string);

// This is C# code
string CondenseHugeCalc(
  Func<HugeResult, string> condenser,
  HugeParams hparams) 
{
  var huge = PerformHugeCalc(hparams);
  return condenser(huge);
}

// usage, assuming we have a method
// string WriteToFile(HugeResult result);
// and a chat object that can send data somewhere
chat.Send(CondenseHugeCalc(WriteToFile, hparams));

This is all fine and dandy, so far. What if we want to make CondenseHugeCalc generic over the return type of our condenser?

TRes CondenseHugeCalc<TRes>(
  Func<HugeResult, TRes> condenser,
  HugeParams hparams) 
{
  var huge = PerformHugeCalc(hparams);
  return condenser(huge);
}

// usage, assuming we have a method
// int ReportLength(HugeResult result);
chat.Send(CondenseHugeCalc(ReportLength, hparams));

“I can do this!” - C# just giggles at how easy this is. That is, until it spits out its milk when we try to pass in a void function (say, void PlaySound(HugeResult result); that plays the HugeResult as a sound without also returning a value).

The curse of the void

Passing in a void function results in Error: the type arguments for 'CondenseHugeCalc' cannot be inferred from the usage. Try specifying the type arguments explicitly. Following the advice leads to two more errors;

  • Error: Keyword 'void' cannot be used in this context
  • Error: The type 'void' may not be used as a type argument.

This is because in C# and many other languages the void type is special: variables (or type arguments) cannot have the type void. It is only allowed as a marker for functions without a return value, so you are not allowed to construct a Func<HugeResult, void>. Therefore C# is forced to add a different type, Action, to represent the types of void functions, and use Action<HugeResult> as an unfortunate substitute for the invalid Func<HugeResult, void>. We cannot modify CondenseHugeCalc to take an Action<HugeResult> however, as then we wouldn’t be able to pass WriteToFile or ReportLength anymore.

Yes, we could add a CondenseHugeCalc overload that takes Action and returns the empty string, or write a trivial wrapper function for PlaySound, which calls PlaySound, and returns the empty string. But what fun is that when there’s a perfectly fine “my type system is better than yours” argument going on? There are all sorts of possible workarounds to problems, but we’re discussing the issue itself.

It is at this point I envision Scala saying, “Hold my Cognac” and proceeding to present this;

def condenseHugeCalc[TRes](condenser: HugeResult => TRes, hparams: HugeParams): TRes = {
  val huge = performHugeCalc(hparams)
  condenser(huge)
}

// usage in Scala; assuming we have a method
// playSound(result: HugeResult): Unit
chat.send(condenseHugeCalc(playSound, hparams))

This works regardless of condenser return type, Unit or otherwise. Uniformity!

What would be sent over chat in this case, depends on the implementation of send.

So, in conclusion, Unit is emphatically not the same as void as the former allows for uniformity and the latter unnecessarily forces a division onto the type system, as evidenced by the Func/Action divide in C#.
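Python makes no such division either, so the same idea can be sketched there, with None playing the role of Unit. This is only an illustration under assumptions: perform_huge_calc is a hypothetical stand-in that merely builds a list, the other names mirror the ones above, and the type hints are not enforced at runtime.

```python
from typing import Callable, TypeVar

TRes = TypeVar("TRes")

# Hypothetical stand-in for the huge calculation; here it just builds a list.
def perform_huge_calc(size: int) -> list[int]:
    return list(range(size))

def condense_huge_calc(condenser: Callable[[list[int]], TRes], size: int) -> TRes:
    huge = perform_huge_calc(size)
    return condenser(huge)

def report_length(huge: list[int]) -> int:
    return len(huge)

def play_sound(huge: list[int]) -> None:
    # Acts only through its side effect and returns None implicitly.
    print(f"playing {len(huge)} samples")

print(condense_huge_calc(report_length, 1000))  # 1000
print(condense_huge_calc(play_sound, 1000))     # prints the message, then None
```

The same generic function accepts both condensers because a function that returns nothing useful still returns an ordinary value.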

Discarding the value of an expression

Ignoring the result of an expression (any expression, not just in conditionals) is fine in Scala and many other languages even if your function returns something important, maybe info about a monetary transaction that must not be lost.

It is then generally assumed that you wanted to execute the function for its side-effects (i.e. say it stored the transaction info to a file first, and sometimes you don’t care about the returned value, only that the transaction is now stored). Ignoring the result of a pure function (having no side-effects) is just wasteful.

Pros and cons

Discarding results

Some languages acknowledge that discarding the result is possibly a mistake. In Rust, you can mark a function with the #[must_use] attribute to make the compiler warn when the result of the function is thrown away. Since C++17 there’s a corresponding [[nodiscard]] attribute.

For Scala, I’m not aware of any annotation, but you can compile your code with -Wvalue-discard and -Werror in which case discarding any ‘important’ (non-Unit) result would be flagged as an error.

Some linting tools for various languages can flag this potential mistake as well.

Readability vs composability

Another common criticism of expression orientation is that it can hurt readability; statements have the advantage of having a clear beginning and end, whereas expressions can be endlessly combined.

However, the composability of expressions is also seen as a strength of expression oriented languages.

Mixing up assignment and equality

When assignment is also a valid expression, accidentally mixing up assignment and equality checks is a source of bugs, because you might do (example in C):

if(currentAnswer = correctAnswer)
    printf("Very good?!");

which is valid C, but mistakenly modifies currentAnswer and then uses the assigned value as the condition, when you probably meant to check for equality;

if(currentAnswer == correctAnswer)
    printf("Very good!");

This potential confusion is one reason some languages don’t have assignment expressions (only assignment statements) or disallow the use of an assignment expression where a boolean expression is expected, even if it is sometimes convenient to do:

# This is valid since Python 3.8
import re
if match := re.search(pattern, text):
    print("Found:", match.group(0))
elif match := re.search(otherpattern, text):
    print("Alternate found:", match.group(0))

which is valid Python since 2019, when the assignment expression (:=) was introduced, to simplify cases like:

match = re.search(pattern, text)
if match:
    print("Found:", match.group(0))
else:
    match = re.search(otherpattern, text)
    if match:
        print("Alternate found:", match.group(0))

This section from the proposal to add assignment expressions to Python is informative;

Why not just turn existing assignment into an expression? C and its derivatives define the = operator as an expression, rather than a statement as is Python’s way. This allows assignments in more contexts, including contexts where comparisons are more common. The syntactic similarity between if (x == y) and if (x = y) belies their drastically different semantics. Thus this proposal uses := to clarify the distinction.
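The distinction the proposal draws can be checked directly by asking Python whether a snippet is syntactically valid. The helper name compiles below is hypothetical; it only wraps the built-in compile function.

```python
def compiles(source):
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# Plain assignment is a statement, so it is rejected as a condition.
print(compiles("if x = y:\n    pass\n"))   # False
# Equality and the := assignment expression are both accepted.
print(compiles("if x == y:\n    pass\n"))  # True
print(compiles("if x := y:\n    pass\n"))  # True
```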

Paradigm comparison

In general, functional programming languages are expression-oriented, whereas imperative and object-oriented languages tend to be statement-oriented, but of course some languages are multiparadigm which muddies the water.

Lesser known paradigms are harder to classify, but concatenative languages are definitely expression oriented as they mostly do not have ‘statements’ - all words are expressions that operate on implicit parameters and compose well with other words to form more complex expressions. Concatenative languages are often stack-based where parameters are popped off the stack and results are pushed onto the stack. Forth was probably the first such language, but there are more modern variants like Factor. Array-oriented languages such as APL, J or Uiua also lean towards an expression-oriented style with implicit parameters (tacit programming), and Uiua is stack-based as well.
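The stack-based model is easy to sketch in Python. The toy evaluator below is not real Forth or Factor; the function run and its three words are assumptions made for illustration. It shows how every word takes its parameters from the stack and pushes its result back, so that programs compose simply by being written one after another.

```python
def run(program, stack=None):
    """Evaluate a whitespace-separated program on a stack (a toy, not real Forth)."""
    stack = [] if stack is None else list(stack)
    words = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),
    }
    for token in program.split():
        if token in words:
            words[token](stack)       # a word pops its inputs and pushes its result
        else:
            stack.append(int(token))  # anything else is treated as an integer literal
    return stack

square = "dup *"                  # a new 'word' built by concatenating two others
print(run(f"3 {square}"))         # [9]
print(run(f"3 {square} 1 +"))     # [10]
```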

You might think of SQL as statement oriented, since the terminology is that everything is ‘statements’, but from our point of view, the core statements of SQL are actually expressions, as you can combine, say, a SELECT statement with a DELETE statement. The combination is also an expression, as you can output the deleted rows (with RETURNING in PostgreSQL, for example) to combine even further;

DELETE FROM orders
WHERE id IN (
  SELECT id FROM orders
  WHERE amount = 0)
RETURNING id

Assembly languages are at the other extreme, being purely statement oriented as there is no way to build compound expressions, short of using macro-assemblers.

My preference

I prefer expression-oriented functional languages over statement based imperative languages, as the former are more expressive, which in this context sounds like a joke, but is due to expressions having fundamentally better composability than statements. The result of a calculation can be fed as input into another without being overly verbose about the intermediate steps.