Sunday, November 20, 2016

Combinatorial Problems in Haskell

A Simple Exercise

"Given three sets of characters, build all the possible words of 3 characters, with the first character from the first set, the second character from the second set, and the last character from the last set."

This is an exercise adapted from a similar one on the Haskell Programming from first principles book, and the results are part of the work done during the Haskell Day with my group of Haskellers!

{-# LANGUAGE OverloadedStrings #-}

module Combinatorial where


set1 :: [Char]
set1 = "ab"

set2 :: [Char]
set2 = "12"

set3 :: [Char]
set3 = set1

The Recursive Solution

In a classical imperative solution we would write three nested loops trying all combinations. In Haskell we can replace loops with recursion.

allWordsOf3Chars :: [Char] -> [Char] -> [Char] -> [String]
allWordsOf3Chars set1 set2 set3 = combine1 set1 set2 set3
 where

   combine1 [] _ _ = []
   combine1 (c1:cs) set2 set3 = combine2 c1 set2 set3 ++ combine1 cs set2 set3

   combine2 _ [] _ = []
   combine2 c1 (c2:cs) set3 = combine3 c1 c2 set3 ++ combine2 c1 cs set3

   combine3 _ _ [] = []
   combine3 c1 c2 (c3:cs) = [[c1, c2, c3]] ++ combine3 c1 c2 cs
We test it
test1 = allWordsOf3Chars set1 set2 set3
  > test1 
  ["a1a","a1b","a2a","a2b","b1a","b1b","b2a","b2b"]

The KISS Solution

In Haskell list comprehension has a combinatorial semantic, and so we can write simply

allWordsOf3Chars' set1 set2 set3
  = [ [c1, c2, c3] | c1 <- set1, c2 <- set2, c3 <- set3 ]
isOk1 = test1 == (allWordsOf3Chars' set1 set2 set3)
  > isOk1
  True

The KISS solution is very easy to read, and very fast to write. A good selling point for the Haskell language.

The Advanced Solution

In Haskell the Applicative instance of List, has a combinatorial semantic, so we can write
allWordsOf3Chars'' set1 set2 set3
  = (\c1 c2 c3 -> [c1, c2, c3] ) <$> set1 <*> set2 <*> set3
isOk2 = test1 == (allWordsOf3Chars'' set1 set2 set3)
  > isOk2
  True

The Extended Exercise

We can generalize the first problem, passing a list of sets of chars, instead of only 3 sets.

The Extended Recursive Solution

In an imperative language, generalizing the solution is not simple, because we have a number of loops that is not fixed.

Generalizing the Haskell recursive solution is apparently simpler, because we start already from a recursion, but in practice it is rather hard to figure out the right algorithm.

combineWords :: [String] -> [String]
combineWords sets = combine [] sets 

 where

   combine :: String -> [String] -> [String]
   combine decidedPart [] = [decidedPart]
   combine decidedPart (set1:sets) = choices decidedPart set1 sets 

   choices :: String -> String -> [String] -> [String]
   choices decidedPart currentSet otherSets =
     concatMap (\c -> combine (decidedPart ++ [c]) otherSets) currentSet
isOk3 = test1 == (combineWords [set1, set2, set3])
  > isOk3
  True
This algo works because:
  • it has a main loop on each set
  • the inner loop selecting one character from the set, calls again the main loop function for obtaining all the possible solutions with the remaining sets
It is a double recursion , and double recursion is often hard to understand.

Writing this algo required too much time respect the initial planned time, and I can not declare that writing it in Haskell is a very easy task.

The extended KISS Solution

Apparently the previous KISS solution based on list comprehension, can not be extended easily to the new version of the problem.

The Extended Advanced Solution

I were in a meetup, so I asked help to someone more knowledgeable than me, and his solution was shocking
combineWords''' :: [String] -> [String]
combineWords''' sets = sequence sets
isOk4 = test1 == (combineWords''' [set1, set2, set3])
  > isOk4
  True
I'm not completely sure of the reason this algo works. I need to study better Haskell, and the magic behind sequence.
  >:t sequence
  sequence :: (Traversable t, Monad m) => t (m a) -> m (t a)
In any case this solution shows the power of Haskell abstractions.

The Pragmatic Solution

At this point, I'm not completely satisfied from the Haskell language:
  • the recursive solution is too time-consuming and difficult to obtain
  • the advanced solution is very short, but too deep for my taste
Is there a better way for solving the problem?

We have a combinatorial problem. For sure an iconic language for expressing and solving combinatorial problems is Prolog. But Haskell supports very well Domain Specific Languages (DSL), and so I can use a Prolog-like DSL language also in Haskell. In this case the MonadPlus applied to lists, is the right instance to use. Note that in case of very complex combinatorial problems, the best solution is probably using high-performance packages like LogicGrowsOnTree, but in this case it is overkill.

combineWords'''' :: [String] -> [String]
combineWords'''' sets = combine [] sets
 where
   combine decidedPart [] = return decidedPart
   combine decidedPart (set1:sets) = do
     c <- set1
     combine (decidedPart ++ [c]) sets
isOk5 = test1 == combineWords'''' [set1, set2, set3]
  > isOk5
  True
Don't ask me why, but this version of the function was very fast and natural to write for me, and very readable. It is based on a simple constraint, and only one recursive call. Probably because a single recursion is easier to understand than the dual recursion version, and because I were accustomed to Prolog.

Also the MonadPlus version of the first version of the problem is very natural to write

allWordsOf3Chars'''' :: [Char] -> [Char] -> [Char] -> [String]
allWordsOf3Chars'''' set1 set2 set3 = do
  c1 <- set1
  c2 <- set2
  c3 <- set3
  return [c1, c2, c3]
isOk6 = test1 == allWordsOf3Chars'''' set1 set2 set3
  > isOk6
  True

Conclusions

If some problems are hard to solve in Haskell, probably there exists some DSL able to express and solve them in a more natural way. The advantage of studying a new DSL, it is that after the initial investment of time, all problems in the same domain can be solved faster.

Probably there is also a payoff in studying better Applicative, Traversable, and sequence function, but for this problem, the MonadPlus solution seems to me a good balance between simplicity and terseness of the code.

No comments:

Post a Comment