# Merging, Folding, Monoids, and Foldable

I recently had need to merge a collection of sorted lists. The task is simple enough, but as usual, in Haskell it was particularly pleasant, and led to some interesting discoveries about folding and monoids. The code is available here, if you want to fiddle with it as you read.

Start with simply merging two (presumed) sorted lists. This code is about as clear as it could be:

```haskell
merge2 :: Ord a => [a] -> [a] -> [a]
merge2 [] bs = bs
merge2 as [] = as
merge2 as@(a:at) bs@(b:bt)
  | a <= b    = a : merge2 at bs
  | otherwise = b : merge2 as bt
```

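As a quick sanity check of the behavior (the definition is repeated here so the snippet runs on its own):

```haskell
-- merge2, as defined above, interleaves two sorted lists into one sorted list.
merge2 :: Ord a => [a] -> [a] -> [a]
merge2 [] bs = bs
merge2 as [] = as
merge2 as@(a:at) bs@(b:bt)
  | a <= b    = a : merge2 at bs
  | otherwise = b : merge2 as bt

main :: IO ()
main = print $ merge2 [1,4,6] [2,3,5]  -- [1,2,3,4,5,6]
```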
To merge multiple lists, we can use a fold, which makes quick work of the task:

```haskell
mergeLinear :: Ord a => [[a]] -> [a]
mergeLinear = foldr merge2 []
```

This works, but it isn’t optimal. The nature of `foldr` is such that the combining operation must be applied in linear order down the list. The type signature demands it, as each element of the list can only be combined with a previous partial result:

```haskell
foldr :: (a -> b -> b) -> b -> [a] -> b
```

Thus, the merges are performed in a linear cascade:

When a value is produced from the merge, on average, half the cascade will be traversed finding the next value. We can do better.

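Concretely, the fold expands into a right-nested chain of pending merges, so every produced value bubbles up through the whole cascade (a small sketch; `merge2` is repeated so it stands alone):

```haskell
merge2 :: Ord a => [a] -> [a] -> [a]
merge2 [] bs = bs
merge2 as [] = as
merge2 as@(a:at) bs@(b:bt)
  | a <= b    = a : merge2 at bs
  | otherwise = b : merge2 as bt

-- foldr nests the merges right-associatively: the first list sits at
-- the top of a linear cascade that is n merges deep.
main :: IO ()
main = print $
  foldr merge2 [] [[1,5],[2,6],[3,7]]
    == [1,5] `merge2` ([2,6] `merge2` ([3,7] `merge2` []))
```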
Because `merge2` is an associative operation, the merges could instead be arranged in a tree:

The number of applications of `merge2` is about the same: *n−1* vs. *n*. If the combining operation were something like `(+)`, then there would be no significant difference between these two approaches. However, the iterative operation of `merge2` is going to traverse parts of the structure of the computation repeatedly, and so it makes a rather big difference, traversing only *log₂ n* nodes to find the next value each time, rather than *n/2*. (We’ll measure this empirically, below.)

Let’s create this kind of fold:

```haskell
mapPairs :: (a -> a -> b) -> (a -> b) -> [a] -> [b]
mapPairs pair single = go
  where
    go []       = []
    go [a]      = [single a]
    go (a:b:cs) = pair a b : go cs

foldPairwise :: (a -> a -> a) -> a -> [a] -> a
foldPairwise _ z []  = z
foldPairwise _ _ [a] = a
foldPairwise f z as  = foldPairwise f z $ mapPairs f id as
```

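To see the shape of the reduction, here is a quick look at the passes using `(+)` as a stand-in combining function (a sketch; the definitions are repeated so it runs alone):

```haskell
mapPairs :: (a -> a -> b) -> (a -> b) -> [a] -> [b]
mapPairs pair single = go
  where
    go []       = []
    go [a]      = [single a]
    go (a:b:cs) = pair a b : go cs

foldPairwise :: (a -> a -> a) -> a -> [a] -> a
foldPairwise _ z []  = z
foldPairwise _ _ [a] = a
foldPairwise f z as  = foldPairwise f z $ mapPairs f id as

main :: IO ()
main = do
  print $ mapPairs (+) id [1,2,3,4,5]   -- one pass halves the list: [3,7,5]
  print $ foldPairwise (+) 0 [1..8]     -- [3,7,11,15] -> [10,26] -> 36
```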
And finally:

```haskell
mergePairwise :: Ord a => [[a]] -> [a]
mergePairwise = foldPairwise merge2 []
```

This works, is just as elegant as the linear version, and is indeed more efficient.

Notice that the first two arguments to `foldPairwise` are similar to the two components of a monoid. And we have already noted that `merge2` is associative. This suggests that we recast this kind of fold in terms of `Monoid`:

```haskell
foldPairwiseM :: Monoid a => [a] -> a
foldPairwiseM []  = mempty
foldPairwiseM [a] = a
foldPairwiseM as  = foldPairwiseM $ mapPairs mappend id as
```

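For instance, with the standard `Sum` monoid from `Data.Monoid` (a sketch; the definitions are repeated so the snippet stands alone):

```haskell
import Data.Monoid (Sum(..))

mapPairs :: (a -> a -> b) -> (a -> b) -> [a] -> [b]
mapPairs pair single = go
  where
    go []       = []
    go [a]      = [single a]
    go (a:b:cs) = pair a b : go cs

foldPairwiseM :: Monoid a => [a] -> a
foldPairwiseM []  = mempty
foldPairwiseM [a] = a
foldPairwiseM as  = foldPairwiseM $ mapPairs mappend id as

main :: IO ()
main = print $ getSum (foldPairwiseM (map Sum [1..10]))  -- 55
```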
And now, creating a `Monoid` for merging, we get:

```haskell
newtype Merge a = Merge { merged :: [a] }
  deriving (Eq, Ord, Read, Show)

instance Ord a => Monoid (Merge a) where
  mempty = Merge []
  (Merge a) `mappend` (Merge b) = Merge $ a `merge2` b

mergePairwiseM :: Ord a => [[a]] -> [a]
mergePairwiseM = merged . foldPairwiseM . map Merge
```

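One portability note: since GHC 8.4, `Monoid` has `Semigroup` as a superclass, so on modern compilers the instance is usually split in two. A runnable sketch of that form, with the supporting definitions repeated:

```haskell
merge2 :: Ord a => [a] -> [a] -> [a]
merge2 [] bs = bs
merge2 as [] = as
merge2 as@(a:at) bs@(b:bt)
  | a <= b    = a : merge2 at bs
  | otherwise = b : merge2 as bt

mapPairs :: (a -> a -> b) -> (a -> b) -> [a] -> [b]
mapPairs pair single = go
  where
    go []       = []
    go [a]      = [single a]
    go (a:b:cs) = pair a b : go cs

foldPairwiseM :: Monoid a => [a] -> a
foldPairwiseM []  = mempty
foldPairwiseM [a] = a
foldPairwiseM as  = foldPairwiseM $ mapPairs (<>) id as

newtype Merge a = Merge { merged :: [a] }

-- The associative operation lives in Semigroup ...
instance Ord a => Semigroup (Merge a) where
  Merge a <> Merge b = Merge (a `merge2` b)

-- ... and Monoid just adds the identity element.
instance Ord a => Monoid (Merge a) where
  mempty = Merge []

mergePairwiseM :: Ord a => [[a]] -> [a]
mergePairwiseM = merged . foldPairwiseM . map Merge

main :: IO ()
main = print $ mergePairwiseM [[3,4],[1,2],[5]]  -- [1,2,3,4,5]
```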
So, if our combining operation is part of a `Monoid`, then we can use the associativity to create a different kind of fold, one which does not combine in a simple linear cascade. For operations which will traverse the structure of the computation multiple times, like merging, this can be a significant advantage (*log₂ n* vs. *n*).

There is already a type class that abstracts folding: `Foldable`. We can encode our pairwise folding strategy as an instance:

```haskell
newtype PairwiseFoldingList a = PairwiseFoldingList [a]
  deriving (Eq, Ord, Read, Show)

instance Foldable PairwiseFoldingList where
  foldMap f (PairwiseFoldingList as) = foldPairwiseM $ map f as
```

Notice that it is implemented with `foldMap`, rather than the alternative `foldr`. This is because `foldr` and related forms *must* fold in a linear cascade due to their type signatures. Only `fold` and `foldMap` can implement a fold that takes advantage of associativity. (Note: `foldr1` and `foldl1` don’t presume associative combining functions, even though their type signatures admit them.)

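One way to see why `foldr1` and friends cannot safely regroup: with a non-associative operation such as `(-)`, the linear and pairwise groupings give different answers (a sketch; `mapPairs` and `foldPairwise` are repeated from above):

```haskell
mapPairs :: (a -> a -> b) -> (a -> b) -> [a] -> [b]
mapPairs pair single = go
  where
    go []       = []
    go [a]      = [single a]
    go (a:b:cs) = pair a b : go cs

foldPairwise :: (a -> a -> a) -> a -> [a] -> a
foldPairwise _ z []  = z
foldPairwise _ _ [a] = a
foldPairwise f z as  = foldPairwise f z $ mapPairs f id as

main :: IO ()
main = do
  print $ foldr1 (-) [1,2,3,4]         -- 1-(2-(3-4)) = -2
  print $ foldPairwise (-) 0 [1,2,3,4] -- (1-2)-(3-4) = 0
```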
This leads to an interesting observation: if an instance of `Foldable` doesn’t implement `foldMap`, then it loses the opportunity (if it has any) for a more efficient fold when a `Monoid` is involved.

I looked through the entire Haskell Platform and found several `Foldable` instances that didn’t implement `foldMap` but would gain if they did: `HashMap`, `HashSet`, `Seq`, and `Tree` could all exploit their tree structure, and would only improve their performance. `Array`, `Vector`, and `[]` could each use the pairwise strategy. It isn’t clear if these are worth it in all cases: there is some overhead in `foldPairwiseM` over a simple `foldr`. Perhaps approaches like `PairwiseFoldingList` are best for these three.

We can, for completeness, now re-implement our two merges based on `Foldable`:

```haskell
mergeFoldable :: (Foldable t, Ord a) => t [a] -> [a]
mergeFoldable = merged . foldMap Merge

mergeLinearF :: Ord a => [[a]] -> [a]
mergeLinearF = mergeFoldable

mergePairwiseF :: Ord a => [[a]] -> [a]
mergePairwiseF = mergeFoldable . PairwiseFoldingList
```

It would be nice to validate that the pairwise strategy indeed performs fewer item comparisons in a merge. Our `Monoid` and `Foldable` based code makes this easy:

```haskell
newtype CountedMerge a = CountedMerge { countedMerged :: [(a, Int)] }
  deriving (Eq, Ord, Read, Show)

instance Ord a => Monoid (CountedMerge a) where
  mempty  = CountedMerge []
  mappend = countedMerge2

countedMerge2 :: Ord a => CountedMerge a -> CountedMerge a -> CountedMerge a
countedMerge2 (CountedMerge as0) (CountedMerge bs0) = CountedMerge $ go as0 bs0
  where
    go [] bs = bs
    go as [] = as
    go as@((a,an):at) bs@((b,bn):bt)
      | a <= b    = (a,an+1) : go at bs
      | otherwise = (b,bn+1) : go as bt

toCountedMerge :: [a] -> CountedMerge a
toCountedMerge = CountedMerge . map (\e -> (e,0))

countedMergeFoldable :: (Foldable t, Ord a) => t [a] -> [(a, Int)]
countedMergeFoldable = countedMerged . foldMap toCountedMerge

countedMergeLinearF :: Ord a => [[a]] -> [(a, Int)]
countedMergeLinearF = countedMergeFoldable

countedMergePairwiseF :: Ord a => [[a]] -> [(a, Int)]
countedMergePairwiseF = countedMergeFoldable . PairwiseFoldingList
```

Now we can run both the linear and pairwise versions, and see how many comparisons per item it took to merge them:

```
=== Comparison counts between linear and pairwise merges, sized 100
number of trials = 1000
avg. number of lists to merge = 49.92
avg. number of items = 2500.06
avg. countedMergeLinear count = 80466.73, 24.09 per item
avg. countedMergePairwise count = 15302.56, 5.36 per item

=== Comparison counts between linear and pairwise merges, sized 500
number of trials = 25
avg. number of lists to merge = 237.40
avg. number of items = 60507.56
avg. countedMergeLinear count = 9702855.80, 118.29 per item
avg. countedMergePairwise count = 510670.64, 7.66 per item

=== Comparison counts between linear and pairwise merges, sized 1000
number of trials = 1
avg. number of lists to merge = 950.00
avg. number of items = 483556.00
avg. countedMergeLinear count = 231494967.00, 478.73 per item
avg. countedMergePairwise count = 4802432.00, 9.93 per item
```

You can see that the linear version makes about *n/2* comparisons per item produced, whereas the pairwise version makes only about *log₂ n*.

In summary:

- Only `fold` and `foldMap` can take advantage of associativity in the monoid to fold more efficiently.
- For some combining functions, like the sorted list merge, the advantage can be significant.
- Some standard instances of `Foldable` should implement `foldMap` for a more efficient fold strategy.

All the code for this, as well as a driver program with QuickCheck tests, can be found here.


This sounds very interesting, but I have to admit you lost me early on when you wrote

> When a value is produced from the merge, on average, half the cascade will be traversed finding the next value.

Can you elaborate why that is? I suppose that’s the lazy evaluation kicking in, i.e. getting the next element in order from the top merge2 call requires that all the other merge2 invocations have to process the next piece of data as well?

Yes, the laziness accounts for the way I described it: When a value of the outermost merge is demanded, the heads of the two inputs are demanded and compared. On the next time ’round, one of those two heads will already be computed, so only the side that produced the previous value will need to step to compute the next head.

Think about how that works in a linear cascade vs. a tree. It might help to work out the merge of these lists by hand: [[3,13..], [2,12..], [1,11..], [4,14..], [5,15..]] each way.

You can see the benefit even if the merges were strict: If each list were on average m long, then in the linear cascade, the innermost merge traverses m + 0 elements, then the next one traverses m + m elements, the next m + 2m… and so on. This is m*n(n+1)/2 traversals. Whereas in the tree form we have n/2 traversals of m + m at the bottom layer, n/4 traversals of 2m + 2m, and so on. A total of log n * (n*m).

I guess you are aware of http://hackage.haskell.org/package/data-ordlist ?

I wasn’t aware of data-ordlist. Thanks for pointing it out.

The function foldt is almost exactly the same in approach as my foldPairwise, though I factor out the utility mapPairs.

mergeAll in that package has somewhat different preconditions than the merges here: It assumes the list of lists is already ordered by the head element. That is, mergeAll [[3,4],[1,2]] doesn’t produce [1,2,3,4]. On the other hand, it supports an infinite list of lists, which my merges don’t.

To clarify, your foldPairwise appears to be identical to foldt’. foldt uses a different shape of tree in order to support the infinite list of lists case, a shape not unlike Okasaki’s random-access lists.

Basically, it’s a (singleton element) `merge` (complete tree of merges of depth 1) `merge` (complete tree of merges of depth 2) `merge` …

As for all the VIP business inside mergeAll, I don’t fully understand it myself at this point in time. Heinrich Apfelmus came up with that, inspired by Dave Bayer’s solution to a very similar problem. It turns out that the use of VIPs isn’t strictly required to implement `mergeAll`, as pointed out by Will Ness; however, VIPs are needed in order to avoid forcing significant parts of the list of lists before it is necessary to do so, as I discovered later. I mostly just served as an editor and arbitrator on the mergeAll implementation.

The cool thing is that we can express merge sort (O(n log n)) and insertion sort (O(n²)) via pairwise and linear versions of the merge:

```haskell
mergeSort, insertionSort :: Ord a => [a] -> [a]
mergeSort     = mergePairwiseF . map return
insertionSort = mergeLinearF . map return
```