Friday, August 2, 2013

Sometimes less is more in language design

Haskell programmers commonly say that Haskell code is easy to reason about, but rarely explain why. This stems from one simple guiding principle: Haskell is simple by default.

Wait, what? Are we talking about the same language? The one with monads and zygohistomorphic prepromorphisms? Yes, I mean that Haskell.

For example, what does this type signature tell us:
x :: Bool
This type signature says that x is a boolean value ... and that's it! Type signatures are stronger in Haskell than in other languages because they also tell us what values are not:
  • x is not a time-varying value
  • x is not nullable
  • x is not a String being coerced to a truthy value
  • x does not have any side effects
In other words, complexity is opt-in when you program in Haskell.

Imperative languages, object-oriented languages, and most other functional languages begin from a more complex baseline than Haskell. They compete over which language provides the most built-in bells and whistles, because they all begin from the premise that more built-in features are better.

However, the problem is that you can't opt out of these features, and you can't assume that a library function you call uses none of them. This means that you must either rely on careful documentation like:
  • "This function need not be reentrant. A function that is not required to be reentrant is not required to be thread-safe."
  • "Throws: will not throw"
  • "The array is changed every time the block is called, not after the iteration is over"
  • "If x is not a Python int object, it has to define an __index__() method that returns an integer."
  • "Great care must be exercised if mutable objects are map keys"
... or you must carefully inspect the original source code.

Haskell makes a different design tradeoff: you begin from a simpler baseline and explicitly declare what you add. If you want statefulness, you have to declare it in the type:
-- Read and write an Int state to compute a String
stateful :: State Int String
If you want side effects, you have to declare them in the type:
takeMedicine :: IO ()
-- Ask your doctor if this medicine is right for you
If you want a value to be nullable, you have to declare it in the type:
toBeOrNotToBe :: Maybe Be
-- That is the question
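To make the opt-in idea concrete, here is a minimal sketch with one definition per kind of declared complexity. The names (nextLabel, findAge, report, isAdult) are hypothetical and chosen only for illustration, and the State type is assumed to come from the mtl package's Control.Monad.State module:

```haskell
import Control.Monad.State (State, evalState, get, put)

-- Stateful: reads and writes an Int counter to compute a String.
-- The State Int in the type is what announces the statefulness.
nextLabel :: State Int String
nextLabel = do
  n <- get
  put (n + 1)
  return ("item-" ++ show n)

-- Nullable: the Maybe forces every caller to handle the missing case.
findAge :: String -> Maybe Int
findAge name = lookup name [("alice", 30), ("bob", 25)]

-- Side effects: printing shows up in the type as IO.
report :: String -> IO ()
report name = case findAge name of
  Just age -> putStrLn (name ++ " is " ++ show age)
  Nothing  -> putStrLn (name ++ " is unknown")

-- Plain function: no state, no nulls, no side effects declared,
-- so none are available to it.
isAdult :: Int -> Bool
isAdult age = age >= 18
```

In GHCi, evalState nextLabel 0 evaluates to "item-0" and findAge "carol" evaluates to Nothing. A reader who sees only the four signatures already knows which of these definitions can touch state, return nothing, or perform I/O.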
Notice that there are some things that Haskell does not reflect in the types, like laziness and unchecked exceptions. Unsurprisingly, these are also two built-in features that people regularly complain about when using Haskell.
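A small sketch of those two gaps, using only the base library. The names risky, ignoresArgument, and forceRisky are hypothetical; the point is that the signature of risky gives no hint that evaluating it fails:

```haskell
import Control.Exception (ArithException, evaluate, try)

-- The type says Int, but evaluating this value throws a
-- divide-by-zero exception. Nothing in the signature says so.
risky :: Int
risky = 1 `div` 0

-- Laziness: the argument is never forced, so passing risky here
-- is harmless and the function still returns 42.
ignoresArgument :: Int -> Int
ignoresArgument _ = 42

-- The exception only surfaces when the value is forced,
-- and catching it has to happen in IO.
forceRisky :: IO (Either ArithException Int)
forceRisky = try (evaluate risky)
```

Here ignoresArgument risky evaluates to 42, while forceRisky produces a Left value carrying the arithmetic exception. Neither behavior can be read off the type Int alone.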

Technically, all these type signatures are optional because Haskell has type inference. If you hate pomp and circumstance and you just want to churn out code, then by all means leave them out and the compiler will handle all the types behind the scenes for you.
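As a rough illustration, the definitions below carry no signatures at all, and the comments show the types GHC should infer for them. The names are hypothetical:

```haskell
import Data.Char (toUpper)

-- inferred: negateAll :: [Bool] -> [Bool]
negateAll = map not

-- inferred: shout :: [Char] -> [Char]   (that is, String -> String)
shout s = map toUpper s ++ "!"

-- inferred: greet :: [Char] -> IO ()
greet name = putStrLn ("Hello, " ++ shout name)
```

Asking GHCi with :type greet reports the inferred signature, so the IO is still tracked even though nobody wrote it down.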

However, you should add explicit type signatures when you share code with other people. These type signatures mentally prepare people who read your code because they place tight bounds on how much context is necessary to understand each function. The less context your code requires, the more easily others can reason about your code in isolation.


  1. I mostly agree, but it is not entirely true that Haskell gives you less by default than other languages.

    For example, "x :: Bool" does not say that x is a boolean as is usually understood from other languages. Bool is a lazy boolean. You could have "x = undefined" and not realize until well into running a program using x that you didn't intend to crash.

    Also, lazy evaluation makes it harder to reason about time and space usage.

    Haskell has chosen certain tradeoffs that happen to be different from those chosen by other languages.

    (Also, there is unsafePerformIO.)

    1. I agree, particularly about the laziness part. This still supports the premise of the post, though: the parts that Haskell does not reflect in the types are precisely the parts that people complain the most about:

      * exceptions (especially asynchronous exceptions)

      * laziness

      * cheating (unsafePerformIO)

      Let me try to work that into the post for balance.

    2. The problem of unsafePerformIO used to annoy me too. But now we have Safe Haskell, which addresses this problem.

      I suppose one could argue that in Haskell, undefined is the equivalent of null pointers in other languages. The difference is that null pointers are used in the normal functioning of bug-free programs. That is not the case for undefined in Haskell.

      As for space and time usage with laziness, I think that is mostly a function of what we are used to.


    also addresses this, contrasting with Common Lisp