Tuesday, July 25, 2017

The Distributed Internet/Distro Dream

My dream is an internet where all static content is distributed using an automatic caching protocol like IPFS, and where (Linux) distributions use Nix-inspired packages.

In my dream Internet Service Providers (ISP) will install IPFS servers on their POPs, and every local computer will be an IPFS node, maybe caching content only locally. Doing so:

  • files will be downloaded only one time, and then cached locally
  • copy of files will be downloaded from the nearest networks
  • static content will remain available also if the reference server is down
  • etc..

In my dreams Linux will have a standard API for managing packages in a Nix-like way, instead of global conflicting packages:

  • every application will be developed and built as a distinct package
  • different versions of the same package would cohexist on the same system  
  • the build system will have a standard way for expressing relationships between the different used packages
  • every target of a build will be a package
  • every distribution of an application, will be a distribution of its package with related dependencies
  • every programming language environment will use the API instead of implementing a custom repository of libraries
  • every package can be customized using an object-oriented like approach: i.e. expressing only the differences respect the upstream/reference code and build configuration
  • every package can be a set of packages, and so it can represent a distributions or other high-level concepts
  • there can be explicit hierarchical chains of maintainers from end-user distro to upstream
  • donations, and contract supports can be shared semi-automatically between maintainers
  • etc..
My dream is a world where things can be shared and modified efficiently.

Wednesday, July 5, 2017

Understanding Traversable

I’m a poor object-oriented programmer, not versed to category theory and advanced Haskell. I understand Functor, Applicative, Foldable, but what is the sense behind Traversable?


class (Functor t, Foldable t) => Traversable t where

   -- | Map each element of a structure to an action, evaluate these actions
   -- from left to right, and collect the results. For a version that ignores
   -- the results see 'Data.Foldable.traverse_'.
   traverse  :: Applicative f => (a -> f b) -> t a -> f (t b)
   traverse f = sequenceA . fmap f

   -- | Evaluate each action in the structure from left to right, and
   -- and collect the results. For a version that ignores the results
   -- see 'Data.Foldable.sequenceA_'.  
   sequenceA :: Applicative f => t (f a) -> f (t a)
   sequenceA = traverse id

In particular what is the meaning of traverse f = sequenceA . fmap f?

TL;DR

For beginner Haskell programmers specific class instances are more understandable and useful than abstract class definitions. It is better understanding and using the IO Monad, or the Maybe Monad, than knowing perfectly the theory behind the generic Monad class definition. For the same reasons, it is better understanding different instances of Traversable class, than the abstract theory behind it.

Traverse a List applying Maybe semantic

Traversable uses functions with two generic types: f that is an Applicative context, t that is a Foldable Functor.
We will start with an example using Maybe for the applicative part (the f), and List for the Functor part (the t).
Maybe has a simple and clear Applicative semantic: stop the computation and return Nothing when one of intermediate passages returns Nothing.
List has a simple Functor semantic: fmap applies a function to every element of the list.

sequenceA

The sequenceA function became
   sequenceA :: [Maybe a] -> Maybe [a]
Studying type definition, the semantic is clear: extract Just from the elements of the list, and if there is any Nothing element, return Nothing instead of the list.

{-# LANGUAGE ScopedTypeVariables #-}

module Main where

import Data.Traversable
import Control.Applicative
import Data.List as L
-- | All the tests of the code assertions.
main = putStrLn $ show $ L.all id [
           mtest1
        ,  mtest2
        ,  mtest1M
        ,  mtest2M
        ,  mtest3
        ,  mtest4
        ,  mtest3M
        ,  mtest4M
        ,  mtest4M'
        ,  stest1
        ,  stest2
        ,  stest3
        ,  ltest1
        ,  ltest2
        ]

We will remove Just from the list:

mf1 :: Maybe [Int]
mf1 = sequenceA [Just 1, Just 2]

mtest1 :: Bool
mtest1 = (mf1 == Just [1, 2])

If we pass a Nothing value, then the entire result became Nothing:

mf2 :: Maybe [Int]
mf2 = sequenceA [Just 1, Just 2, Nothing]

mtest2 :: Bool
mtest2 = (mf2 == Nothing)

We can rewrite these functions using the do-notation:

mf1M :: Maybe [Int]
mf1M = do
  x <- return 1
  y <- return 2
  return [x, y]

mtest1M = (mf1 == mf1M)

mf2M :: Maybe [Int]
mf2M = do
  x <- return 1
  y <- return 2
  z <- empty
  return [x, y, z]

mtest2M = (mf2 == mf2M)

So what is the semantic of sequenceA for this specific instance of Traversable? The semantic is self-explanatory studying its type sequenceA :: [Maybe a] -> Maybe [a]. But sadly for us, we can not generalize it, as we will see in next sections.

traverse

The traverse function became:

   traverse   :: (a -> Maybe b) -> [a] -> Maybe [b]

For playing with traverse, we need a function returning Maybe:

mf :: Int -> Maybe Int
mf x = if (even x) then Just x else Nothing

and then:

mf3 :: Maybe [Int]
mf3 = traverse mf [2,4]
mtest3 = (mf3 == Just [2,4])

All elements of the list are even, and so the same list without modifications is returned.
If we insert a not even element, then Nothing is returned:

mf4 :: Maybe [Int]
mf4 = traverse mf [2,4,5]
mtest4 = (mf4 == Nothing)

As usual we can rewrite using the do notation

mf3M :: Maybe [Int]
mf3M = do
  x <- mf 2
  y <- mf 4
  return [x, y]

mtest3M = (mf3M == mf3)

mf4M :: Maybe [Int]
mf4M = do
  x <- mf 2
  y <- mf 4
  z <- mf 5
  return [x, y, z]

mtest4M = (mf4M == mf4)

Traversable defines also mapM that is simply traverse. We can rewrite in this way:

mf4M' :: Maybe [Int]
mf4M' = do
  r <- mapM mf [2, 4, 5]
  return r

mtest4M' = (mf4M' == mf4M)

In this case, the traverse function captures the well known concept of mapM inside the Maybe Applicative.

Traverse a List applying List semantic

Now we will use an instance of Traversable with List both as container (for t), and as Applicative (for f).
The List applicative behavior is similar to Prolog: it combines all possible combinations of generators, filtering on constraints.
The List functor behavior is the usual map: it applies a function to every element of the list.

sequenceA

In this case, we have:

   sequenceA :: [[a]] -> [[a]]
   sequenceA = traverse id

where

   traverse  :: (a -> [b]) -> [a] -> [[b]]
   traverse f = List.foldr cons_f (pure [])
     where consF x ys = (:) <$> f x <*> ys

Due to specific implementation of traverse for List, sequenceA became a combinatoric function performing a “transpose-like” operation, combining columns with lines:

stest1 = sequenceA [[1,2,3], [4,5]] == [[1,4],[1,5],[2,4],[2,5],[3,4],[3,5]]

The corresponding function defined using Prolog-like semantic is

transposeAndCombine :: [[a]] -> [[a]]
transposeAndCombine linesAndCols = tc [] linesAndCols
 where
  tc :: [a] -> [[a]] -> [[a]]
  tc r1 [] = return r1
  tc r1 (xs:rs) = do
    x <- xs
    tc (r1 ++ [x]) rs

stest3 = let l = [[1,2,3], [4,5]]
         in  transposeAndCombine l == sequenceA l

In case of Maybe the Nothing value invalidates all the computations. In case of List the value invalidating all computations is []:

stest2 = sequenceA [[1,2,3], [4,5], []] == []

In this case the sequenceA function has a rather useful and reusable behaviour: transpose and combine columns with lines. Knowing this behavior in advance, the sequenceA function can be called directly, without using the do-notation form that is less clear.
But this behavior is very different from the Traversable instance with Maybe and Applicative. So the Traversable class does not help us in predicting the sequenceA semantic. We had to study it case by case.

traverse

traverse became:

   traverse  :: (a -> [b]) -> [a] -> [[b]]
   traverse f = List.foldr cons_f (pure [])
     where consF x ys = (:) <$> f x <*> ys

If we play with traverse, we obtain:

lf :: Int -> [Int]
lf x = [x * 10, x * 100]
lxs :: [Int]
lxs = [1, 2]
lf1 :: [[Int]]
lf1 = traverse lf lxs
lfxs = [[10, 20], [10, 200], [100, 20], [100, 200]]
ltest1 = (lf1 == lfxs)

The semantic using Prolog-like rules is not immediate. A first but bad version is:

lf1M' :: [Int]
lf1M' = do
  x <- lxs
  y <- lf x
  return y

ltest1M' = (lf1M' == [10, 20, 100, 200])

It isn’t correct because it combines too few things.
The behavior of traverse is: transposeAndCombine the results of the function applications with the list of possible arguments. The corresponding code is:

lf1AsTransf :: [[Int]]
lf1AsTransf = transposeAndCombine (map lf lxs)
ltest2 = (lf1AsTransf == lfxs)

In this form the code is clear, so we don’t derive a Prolog-like version.
In this case traverse has not a basic and natural semantic. Probably there are not much cases in real-life code, where we want such strange and extreme combinatoric behavior. Probably we are more interested to the behavior of functions like lfm1M', expressed with the do-notation.

Conclusions

After these examples, we can return to our original question: what is the meaning of traverse f = sequenceA . fmap f? My lazy and arrogant answer is: I don’t bother! :-)
The motivations are:
  • also if I can grasp the concepts behind Traversable, it will not help in understanding real-life code using Traversable, because every instance of traverse and sequenceA has a very different and specific semantic.
  • so I must study each instance of Traversable in isolation, for understanding its behavior. This is similar to IO Monad, Maybe Monad, Either Monad: knowing the Monad concepts helps, but every instance has its proper semantic and usage case, and it must be mastered apart.
  • maybe instance by instance, I can someday comprehend the concepts behind Traversable, but up to date this can be postponed, because it seems more complex to master respect Functor and other base class.
  • in the end I’m a poor OO programmer, not a mathematician expert of category-theory.
In Haskell there are very few class, and they are usually very abstract and tied to category theory concepts: Monad, Functor, Applicative, etc.. The laws of these class allows for nice combinations of instances. This is a sign of the power and effectiveness of category-theory. But these class are mainly used for writing composable DSLs or nice libraries. End-user code uses instead specific instances. Each instance has a specific semantic that can not be predicted only from the knowledge of the reference class. This is in contrast with the OO world where a parent class describes exactly the behavior of a child class. For example Monad is a class describing nice properties for composable actions. But the semantic of a List Monad (Prolog-like rules) is very different from the semantic of an Either Monad (actions with exceptions).
So Traversable can represent many different things in Haskell. Also if I don’t understand completely its meaning, because it is too much abstract and tied to category-theory universe, this does not prevent me from studying and comprehending perfectly its specific instances, and using them in end-user code.