Immutability and unboxing in array programming

27 September 2017 — by Manuel M T Chakravarty

This is the third post in a series about array programming in Haskell — you might be interested in the first and second, too.

In the previous post of this series, we discussed commonly used vector and matrix routines, which are available in highly-optimised implementations in most programming languages. However, often we need to implement our own custom array algorithms. To this end, the Haskell standard (Haskell 2010) already comes with a simple array API in the form of the Data.Array standard module. These arrays are immutable, boxed, and non-strict. This allows for the elegant, high-level description of many array algorithms, but it is suboptimal for compute-intensive applications as boxing and non-strictness, especially in combination with the reliance on association lists for array construction, lead to code that is very difficult for the Haskell compiler to optimise, resulting in rather limited performance.

Strictness

The extra expressiveness granted by non-strictness is nicely displayed by the following example, courtesy of the classic A Gentle Introduction to Haskell. The wavefront function defines an nxn array whose leftmost column and topmost row are populated with 1s (by the first two list comprehensions below). All other array elements are defined as the sum of the three elements to their left, top, and top-left (by the third list comprehension in the definition).

wavefront :: Int -> Array (Int,Int) Int
wavefront n = arr
  where
    arr = array ((1,1),(n,n))
                ([((1,j), 1) | j <- [1..n]] ++
                 [((i,1), 1) | i <- [2..n]] ++
                 [((i,j), arr!(i,j-1) + arr!(i-1,j-1) + arr!(i-1,j))
                             | i <- [2..n], j <- [2..n]])

The ! infix operator implements array indexing:

(!) :: Ix i => Array i e -> i -> e

Moreover, the function array constructs an array from an association list mapping indexes to values:

array :: Ix i => (i, i) -> [(i, e)] -> Array i e

The elegance of wavefront is in the recursive definition of the array arr. In the expression arr!(i,j-1) + arr!(i-1,j-1) + arr!(i-1,j), we access the elements to the left, top, and top-left of the current one by appropriate indexing of the very array that we are currently in the process of defining. Such a recursive dependency is only valid for a non-strict data structure.

Boxing

Unfortunately, the expressiveness of non-strict arrays comes at a price, especially if the array elements are simple numbers. Instead of being able to store those numeric elements in-place in the array, non-strict arrays require a boxed representation, where the elements are pointers to heap objects containing the numeric values. This additional indirection requires extra memory and drastically reduces the efficiency of array access, especially in tight loops. The layout difference between an unboxed (left) and a boxed (right) representation is illustrated below.

While both strict and non-strict data structures admit boxed representations, non-strict structures typically require boxing. To provide an alternative to the standard non-strict arrays, the array package provides strict, unboxed arrays of type Data.Array.Unboxed.UArray i e. By way of overloading via the type class Data.Array.IArray.IArray, they provide the same API as the standard non-strict, boxed arrays. However, the element type is restricted to basic types that can be stored unboxed, such as integral and floating-point numeric types.

Unfortunately, array construction based on association lists, such as the array function, still severely limits the performance of immutable UArrays. Nevertheless, at least, array access by way of (!) is efficient for unboxed arrays.

Immutability

While immutable arrays —i.e., arrays that cannot directly be in-place updated— are semantically simpler, it turns out that indexed-based array construction is drastically more efficient for mutable arrays. Hence, computationally demanding Haskell array code typically adopts a two-phase array life cycle: (1) arrays are allocated as mutable arrays and initialised using in-place array update; once initialised, (2) they are frozen by making them immutable.

This usage pattern is supported by the interface provided by the Data.Array.MArray.MArray class and we use it in the following example function generate to initialise an array of Doubles with an index-based generator function gen:

generate :: Ix i => (i, i) -> (i -> Double) -> UArray i Double
generate bnds gen
  = runSTUArray $ do
      arr <- newArray_ bnds
      mapM_ (\i -> writeArray arr i (gen i)) (range bnds)
      return arr

Mutable arrays come in various flavours that are, in particularly, distinguished by the monad in which the array operations take place. Usually, this is either IO or ST, and the array package provides both boxed and unboxed variants for both monads. We have the boxed IOArray and the unboxed IOUArray as well as the boxed STArray and the unboxed STUArray. The above definition of generate uses STUArray to initialise the array, and then, freezes it into a UArray, which is returned. The choice of STUArray is implicit in the use of runSTUArray, which executes the code in the state transformer monad ST and freezes the STUArray into a UArray.

The function newArray_ provides a fresh uninitialised array:

newArray_ :: (MArray a e m, Ix i) => (i, i) -> m (a i e)

We can read and write a mutable array with

readArray  :: (MArray a e m, Ix i) => a i e -> i      -> m e
writeArray :: (MArray a e m, Ix i) => a i e -> i -> e -> m ()

In the definition of generate, we use writeArray once on each index in the range range bnds of the mutable array to initialise the value at index i with the value of the generator function at that index, gen i.

Generally, we can freeze a mutable array, obtaining an immutable array, with

freeze :: (Ix i, MArray a e m, IArray b e) => a i e -> m (b i e)

There is also the unsafe variant unsafeFreeze that avoids copying the array during freezing, but puts the onus on the programmer to ensure that the mutable argument is subsequently not updated anymore. In the code for generate above, we indirectly use unsafeFreeze by way of runSTUArray. As runSTUArray makes it impossible to use the mutable array after freezing, this encapsulated use of unsafeFreeze is always safe.

An expression, such as, generate (1,100) ((^2) . fromIntegral) produces an unboxed array with the first one hundred square numbers. Internally, it is based on the initialisation of a mutable array, but this is completely abstracted over by the definition of generate. While there is no inbuilt need to ever freeze a mutable array, good functional programming style requires to keep mutable code as localised as possible and to avoid passing mutable data structures around.

Summary

Well written code based on unboxed arrays and using the discussed pattern to create arrays by initialising a mutable version, which is subsequently frozen, can achieve performance comparable to low-level C code. In fact, the collection-oriented high-performance array frameworks that we will discuss in subsequent blog posts work exactly in this manner.

If you enjoyed this article, you might be interested in joining the Tweag team.

This article is licensed under a Creative Commons Attribution 4.0 International license.

← GHC compiler plugins in the wild: typing Java How to compose streaming programs →