Sunday, November 20, 2016

Combinatorial Problems in Haskell

A Simple Exercise

"Given three sets of characters, build all the possible words of 3 characters, with the first character from the first set, the second character from the second set, and the last character from the last set."

This is an exercise adapted from a similar one on the Haskell Programming from first principles book, and the results are part of the work done during the Haskell Day with my group of Haskellers!

{-# LANGUAGE OverloadedStrings #-}

module Combinatorial where


set1 :: [Char]
set1 = "ab"

set2 :: [Char]
set2 = "12"

set3 :: [Char]
set3 = set1

The Recursive Solution

In a classical imperative solution we would write three nested loops trying all combinations. In Haskell we can replace loops with recursion.

allWordsOf3Chars :: [Char] -> [Char] -> [Char] -> [String]
allWordsOf3Chars set1 set2 set3 = combine1 set1 set2 set3
 where

   combine1 [] _ _ = []
   combine1 (c1:cs) set2 set3 = combine2 c1 set2 set3 ++ combine1 cs set2 set3

   combine2 _ [] _ = []
   combine2 c1 (c2:cs) set3 = combine3 c1 c2 set3 ++ combine2 c1 cs set3

   combine3 _ _ [] = []
   combine3 c1 c2 (c3:cs) = [[c1, c2, c3]] ++ combine3 c1 c2 cs

We test it

test1 = allWordsOf3Chars set1 set2 set3

  > test1 
  ["a1a","a1b","a2a","a2b","b1a","b1b","b2a","b2b"]

The KISS Solution

In Haskell list comprehension has a combinatorial semantic, and so we can write simply

allWordsOf3Chars' set1 set2 set3
  = [ [c1, c2, c3] | c1 <- set1, c2 <- set2, c3 <- set3 ]

isOk1 = test1 == (allWordsOf3Chars' set1 set2 set3)

  > isOk1
  True

The KISS solution is very easy to read, and very fast to write. A good selling point for the Haskell language.

The Advanced Solution

In Haskell the Applicative instance of List, has a combinatorial semantic, so we can write

allWordsOf3Chars'' set1 set2 set3
  = (\c1 c2 c3 -> [c1, c2, c3] ) <$> set1 <*> set2 <*> set3

isOk2 = test1 == (allWordsOf3Chars'' set1 set2 set3)

  > isOk2
  True

The Extended Exercise

We can generalize the first problem, passing a list of sets of chars, instead of only 3 sets.

The Extended Recursive Solution

In an imperative language, generalizing the solution is not simple, because we have a number of loops that is not fixed.

Generalizing the Haskell recursive solution is apparently simpler, because we start already from a recursion, but in practice it is rather hard to figure out the right algorithm.

combineWords :: [String] -> [String]
combineWords sets = combine [] sets 

 where

   combine :: String -> [String] -> [String]
   combine decidedPart [] = [decidedPart]
   combine decidedPart (set1:sets) = choices decidedPart set1 sets 

   choices :: String -> String -> [String] -> [String]
   choices decidedPart currentSet otherSets =
     concatMap (\c -> combine (decidedPart ++ [c]) otherSets) currentSet

isOk3 = test1 == (combineWords [set1, set2, set3])

  > isOk3
  True

This algo works because:

it has a main loop on each set
the inner loop selecting one character from the set, calls again the main loop function for obtaining all the possible solutions with the remaining sets

It is a double recursion , and double recursion is often hard to understand.

Writing this algo required too much time respect the initial planned time, and I can not declare that writing it in Haskell is a very easy task.

The extended KISS Solution

Apparently the previous KISS solution based on list comprehension, can not be extended easily to the new version of the problem.

The Extended Advanced Solution

I were in a meetup, so I asked help to someone more knowledgeable than me, and his solution was shocking

combineWords''' :: [String] -> [String]
combineWords''' sets = sequence sets

isOk4 = test1 == (combineWords''' [set1, set2, set3])

  > isOk4
  True

I'm not completely sure of the reason this algo works. I need to study better Haskell, and the magic behind sequence.

  >:t sequence
  sequence :: (Traversable t, Monad m) => t (m a) -> m (t a)

In any case this solution shows the power of Haskell abstractions.

The Pragmatic Solution

At this point, I'm not completely satisfied from the Haskell language:

the recursive solution is too time-consuming and difficult to obtain
the advanced solution is very short, but too deep for my taste

Is there a better way for solving the problem?

We have a combinatorial problem. For sure an iconic language for expressing and solving combinatorial problems is Prolog. But Haskell supports very well Domain Specific Languages (DSL), and so I can use a Prolog-like DSL language also in Haskell. In this case the MonadPlus applied to lists, is the right instance to use. Note that in case of very complex combinatorial problems, the best solution is probably using high-performance packages like LogicGrowsOnTree, but in this case it is overkill.

combineWords'''' :: [String] -> [String]
combineWords'''' sets = combine [] sets
 where
   combine decidedPart [] = return decidedPart
   combine decidedPart (set1:sets) = do
     c <- set1
     combine (decidedPart ++ [c]) sets

isOk5 = test1 == combineWords'''' [set1, set2, set3]

  > isOk5
  True

Don't ask me why, but this version of the function was very fast and natural to write for me, and very readable. It is based on a simple constraint, and only one recursive call. Probably because a single recursion is easier to understand than the dual recursion version, and because I were accustomed to Prolog.

Also the MonadPlus version of the first version of the problem is very natural to write

allWordsOf3Chars'''' :: [Char] -> [Char] -> [Char] -> [String]
allWordsOf3Chars'''' set1 set2 set3 = do
  c1 <- set1
  c2 <- set2
  c3 <- set3
  return [c1, c2, c3]

isOk6 = test1 == allWordsOf3Chars'''' set1 set2 set3

  > isOk6
  True

Conclusions

If some problems are hard to solve in Haskell, probably there exists some DSL able to express and solve them in a more natural way. The advantage of studying a new DSL, it is that after the initial investment of time, all problems in the same domain can be solved faster.

Probably there is also a payoff in studying better Applicative, Traversable, and sequence function, but for this problem, the MonadPlus solution seems to me a good balance between simplicity and terseness of the code.

Friday, November 18, 2016

OSS Licenses Debugging

New Conclusions

After this discussion on one of the OSI mailing lists, now I think that MIT, BSD, and ISC are true permissive licences. As usual I'm not a lawyer, and there were no judicial deliberation, so take these notes at your own risk.

Suppose that:

A is product released under MIT, BSD or ISC license
B is a new/derived product using A source code, but released under different license terms
BB is the author of B

From what I understood, if B product cite/credit A copyright holders, and A original license, then thanks to very permissive A license, B product can use A source code inside it. Then according Berne Convention, BB can release B under the license terms he prefers, because he is the copyright holder of a new/derived product B, and not a distributor of A.

Obviously these passages are only needed if BB wants to "release A" under a different license. Otherwise BB can simply add himself to the list of A contributors/copyright holders, and BB can be a normal distributor/contributor of A product.

I will maintain the old but incorrect post, because maybe it can be useful also to others having similar problems in interpreting these licenses. The source of the problem was in the interpretation of terms like "reproduce", "retain" that are to be considered like a citation/credit of the original product license, and not as an inclusion of the original license requirements also into the license of the new released product. 

Old Post Content, with Logical Errors Inside

Abstract

Despite the common wisdom, many permissive licenses like BSD, MIT, ISC are not easy to understand and apply correctly, and theirs terms are ambiguous. Apache-2.0 license is instead a lot more clear and it should be favored.

BSD License

I had to release a short piece of code, using a simple and permissive OSS license. According Wikipedia

BSD licenses are a family of permissive free software licenses, imposing minimal restrictions on the redistribution of covered software.

In particular BSD can be used in commercial and in GPL software:

The BSD License allows proprietary use and allows the software released under the license to be incorporated into proprietary products. Works based on the material may be released under a proprietary license as closed source software, allowing usual commercial usages under.

Perfect!!

Then for sake of security I studied the license text, and I found that it is far from easy to understand!! This is the BSD-2 license template:

Copyright (c) <year>, <copyright holder>
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Suppose B is a product released under the BSD-2 license, and that there is a commercial product C including parts of B source code, and released in compiled form. The license of C can be something like

Copyright (c) 2016, Acme Inc.
All rights reserved.

You can use and distribute this software only if you own a 
valid commercial license from Acme Corporation Inc.

Is C respecting the BSD-2 license of B? In this form no, because the license of B specifies

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: [..] 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

So if we not met condition 2, we can not use B into C.

If we follow precisly these instructions, then C binary application should print a license like this

Copyright (c) 2016, Acme Inc.
All rights reserved.

You can use and distribuite this software only if you own a 
valid commercial license from Acme Corporation Inc.

This software uses modified parts of B software, and according its license

Copyright (c) <year>, <copyright holder>
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

So C license is saying that C must be distributed according the BSD-2 license, and not according a commercial license, because we are stating that the receiver of C can redistribuite it rather freely.

Maybe the first part of C license (the commercial part), is stronger than the second part based on BSD license (the permissive part). But this is not possible because if we do not respect the B license, we can not modify and redistribute it in C.

Maybe the BSD license is referring only to the original B source code, while C source code can be released under a commercial license. But what is the point saying that the B part in a binary product C can be redistributed freely, if we can not separate B from C? Without further specifications the license is referring to the entire C product, result of the modification of B source code.

Worse, if we read the BSD license it does not say in any point that we can relicense the code under other licenses. It says to copy (then propagate) the same BSD license of the B product, also to the derived C product.

More you try to comply precisly with the BSD license, and more it seems a viral license like the GPL.

What is the solution? In my opininion, after a discussion in the Haskell-ITA IRC, the solution is in interpreting the statements of the BSD license with the maximum precision.

The BSD license is composed of these parts:

copyright information
paragraph stating that some conditions must be met, i.e. the part Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
list of conditions
disclaimer of responsibility

The condition in part 3 says a precise thing

Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

It is saying, in a subtle way, that you must include parts 1, 3 and 4, but not part 2. So the C license must be something like

Copyright (c) 2016, Acme Inc.
All rights reserved.

You can use and distribuite this software only if you own a 
valid commercial license from Acme Corporation Inc.

This software uses modified parts of B software, with copyrights:

Copyright (c) <year>, <copyright holder>
All rights reserved.

According B software license:

1. Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

In this form the license makes sense, and it enforces only the citation of original authors, and the fact that they are not responsible for damages. Apart this aspect, the license is for sure very permissive. It is also fair from an accademic point of view, because work attribution is never lost.

Note that from a certain point of view the BSD license is "viral" like the GPL license, in the sense that also a product D based on C source code, can use C only if it mentions the original authors of B and only if D includes in the license the same obligations to cite B authors in all derived works. So you can never loose the attribution of the work of B, if B is released under the BSD license.

The same conclusions apply also in case product C is distributed as source code. In this case we apply the condition 1. instead of 2.

The strange facts are that:

it is not true that the BSD license is easy to understand, because the exact requirements to met are very subtle, and probably they can be stated in a more explicit form from the license
also authoritative web sites of important software companies using BSD source code in their products do not provide clear answers
the Free Software Foundation web site confirms that the BSD license is compatible with the GPL license, but there is no a clear draft of how to comply to it

Applying a software license is like executing the instructions of a program: if you follow precisley the statements, then the license effects should be clear and unambiguous. In case of BSD the instructions were not so clear, but if we try to test and "debug" also other permissive licenses, there are worse problems.

ISC License

This is the text of the ISC license:

Copyright (c) 4-digit year, Company or Person's Name
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

If we apply this license to a commercial product C, then C final license should say that

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

So if you receive a copy of C you can distribute it freely. It seems that ISC license is viral in the sense of the GPL, and not of the BSD license.

MIT License

The MIT license says

Copyright (c)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

If we apply it in a precise way, then the license is saying:

you can do whatever you want (modify, merge, publish, etc..)
you can do whatever you want, only if you include in the final product a license saying that the receiver can do whatever he want, whenever he include this same license in his derivate products, and so on at infinitum for every derived product

So it seems a viral license like the GPL, because the same permissive rights are transferred also to the receiver of the derived work.

WTF License

WTF license says

             DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE 
                    Version 2, December 2004 

 Copyright (C) 2004 Sam Hocevar <sam@hocevar.net> 

 Everyone is permitted to copy and distribute verbatim or modified 
 copies of this license document, and changing it is allowed as long 
 as the name is changed. 

            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE 
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 

  0. You just DO WHAT THE FUCK YOU WANT TO.

The license says that you have maximal freedom for "copying, distribution and modification" of something that is not specified. Implicitely it should be the project/software/file released under the WTF license.

The license in the beginning says also that everyone can copy, distribute verbatim or modified copies of the license document, and you can change the license only if you change the name of the license. In this case the subject of the discourse is not the project/software/file released under the WTF license, but the license itself, called "license document". "License document" for an header to put in every source file sounds a little strange, but it is acceptable.

But it is not so acceptable that in the next sentence the subject of the changes is not anymore the "license document", but the software project, or the file, but it remains unspecified.

So WTF is simple, but a little confusing if applied in a pedantic way. In any case it is for sure a liberal license.

Apache-2.0 License

Apache-2.0 license is a long license, but it is evident that a lot of care is put in it, because it defines clearly the terms used. In the first part it can be followed precisley without problems. Then in the last parts it says

Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:

ok...

You must give any other recipients of the Work or Derivative Works a copy of this License; and

Note that it is not saying that you must apply the Apache license also to derivative works, but only that you must include a copy of this license as reference. This is sound, because if you cite in product C that you are using a product B with Apache license, then you had to include a complete copy of the license as reference. Important also because the license speaks also about patents litigations.

You must cause any modified files to carry prominent notices stating that You changed the files;

Ok clear. If you modify a source code file, you must add you as author to it in the header. It is considered best-practice also in case of the BSD license, but in this case it is stated clearly. So for every source code file (also if kept private) you have the complete list of authors.

and You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works;

Ok rather clear.

and If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.

Ok you must keep the list of original authors like BSD, but also the notices they put in NOTICE file, in case there is one.

You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.

Ok you can add additional license conditions to the derived work C, so C can became a commercial product. Note that:

the Apache license is saying explicitely that you can add additional license terms and conditions to a derived product C, while MIT and BSD license are not saying explicitely this
the Apache license is not saying that the receiver of the product C has all the rights of Apache license also on product C, so Apache license is not viral

We can follow the Apache license precisley, without bad surprises, because it uses clear and direct sentences. Also if it is a license with a long text, in practice on every source code file only a small header must be put, and the complete license is only put in the shipped LICENSE file. So there is no cost for this additional precision.

Old Conclusions (with logical errors inside)

Applying a software license is like executing the instructions of a program: if you follow precisley the statements, then the license effects should be clear and unambiguous.

In my humble opinion, if I follow the ISC and MIT license precisely, then they have a viral behavior like the GPL, so they are not permissive. Probably I'm making some mistake in their interpretation, because all the world is saying that they are very permissive. But if they are permissive, then I doubt that their text and interpretation is clear and simple, and I would be no happy to defend them in a court.

I think that the BSD-2 is a permissive license, but only interpreting the license text in a rather convoluted way. So also in case of BSD I would be no happy to defend it in a court.

Apache 2.0 is a permissive license, and it has by far the most clear and comprehensible text, also if it is not short. So it should be used in all OSS projects needing a permissive license.

All this, obviously in my humble opinion, because I'm not a lawyer.

Wednesday, November 16, 2016

Elm is a Funnny Language

I used Elm (a functional programming language for writing webapps), in a moderately complex project, and I enjoyed it.

Elm has a syntax similar to Haskell, but its semantic is much simpler because it has no classes.

This simplicity is a selling point of Elm, because I see Elm more like a replacement/competitor of JavaScript and its overwhelming ecosystem of libraries, than a competitor of Haskell. Elm semantic is more elegant and predictable than JavaScript, and it has few and more integrated libraries. So the passage from JavaScript to Elm has a lot of advantages from my point of view, and very few disadvantages.

On the contrary Elm can not replace a feature-rich and complex Haskell code-base, because Haskell is based on generic and reusable abstractions like Monoid, Functor, Monad, Traversable, but Elm can not support them without the class concept. Elm sticks to clasical functional programming: no classes, no ad-hoc polymorphism, but only generic functions like map, filter and so on.

Haskell is a general-purpose, multi-paradigm programming language, and you can choose between different type of architectures: e.g. functional reactive programming (FRP), software transactional memory (STM), applicative or monadic domain specific languages (DSL).

On the contrary Elm is laser focused on single-page webapps. For this domain, it enforces a common and fixed architecture, based on a Model, updated by asynchronous Commands and Tasks. After the changes, the view function converts the Model into elements of the user-interface. The Elm platform takes care of updating the user interface in an efficient and incremental way, without recomputing it from scratch, when small parts of the Model are changed. So in an Elm application you have only readable generative code, without low level details about incremental updates.

When you study an Elm application, you know in advance its architecture. This is an advantage respect Haskell (less different programming paradigms) and JavaScript (few and more integrated libraries).

This is an example of Elm code, that converts an ActiveExplosion value of the model, into an Svg value of the user-interface. In particular to a fading circle.

drawExplosion : Model -> Float -> ActiveExplosion -> Svg Msg
drawExplosion model activationTime e =
    let opacity = Basics.max 0.0 (1.0 - (currentTime - activationTime) / (e.deactivationTime - activationTime))
    in  Svg.circle [ cx (toString e.centerX)
                   , cy (toString e.centerY)
                   , r (toString 45.0)
                   , fill (model_fromColorToExplosionGradientId model e.color)
                   , fillOpacity (toString opacity)
                   ] []

Elm functional code can be used also for generating HTML and SVG documents. This is an Elm expression generating the NetRobots logo:

Svg.svg 
  [preserveAspectRatio "xMinYMin meet"
  , viewBox "0 0 350 75"
  , width "100%"
  ,  height "100%"]
  [ g [ Svg.Attributes.style "overflow:hidden; text-anchor: middle; font-size:45; font-weight: bold; font-family: Impact"]
      [text' 
         [ x "175"
         , y "55"
         , Svg.Attributes.style "fill: white; stroke: #0f9; stroke-width: 14"]
         [Svg.text "NetRobots"]
      , text' 
          [ x "175"
          , y "55"
          , Svg.Attributes.style "fill: white; stroke: #99f; stroke-width: 8"] 
          [Svg.text "NetRobots"]
      , text' 
          [ x "175"
          , y "55"
          , Svg.Attributes.style "fill: white; stroke: black; stroke-width: 2"] 
          [Svg.text "NetRobots"]
     ]
  ]

This is the generated SVG document:

<svg preserveAspectRatio="xMinYMin meet" viewBox="0 0 350 75" width="100%" height="100%">
  <g style="overflow:hidden; text-anchor: middle; font-size:45; font-weight: bold; font-family: Impact">
    <text x="175" y="55" style="fill: white; stroke: #0f9; stroke-width: 14">NetRobots</text>
    <text x="175" y="55" style="fill: white; stroke: #99f; stroke-width: 8">NetRobots</text>
    <text x="175" y="55" style="fill: white; stroke: black; stroke-width: 2">NetRobots</text>
  </g>
</svg>

Elm expressions can be combined together, for factoring out repeated patterns, like in this case:

let netRobotsText strokeColor strokeWidth = 
      text' 
        [ x "175"
        , y "55"
        , Svg.Attributes.style ("fill: white; stroke: " ++ strokeColor ++ "; stroke-width: " ++ strokeWidth")]
        [Svg.text "NetRobots"]

in Svg.svg 
     [preserveAspectRatio "xMinYMin meet"
     , viewBox "0 0 350 75"
     , width "100%"
     ,  height "100%"]
     [ g [ Svg.Attributes.style "overflow:hidden; text-anchor: middle; font-size:45; font-weight: bold; font-family: Impact"]
         [ netRobotsText "#0f9" "14"
         , netRobotsText "#99f" "8"
         , netRobotsText "black" "2"]
     ]

Summing up, Elm is useful only for writing single-page webapps, and it is:

simpler and more powerful than JavaScript
simpler but less powerful than Haskell

Saturday, October 22, 2016

RAII Programming in Haskell

I'm using Haskell in production for processing data files, and Data.Text.Lazy is a nice data type, because you can have both simple code managing text, and efficient run-time text processing, because text is loaded chunk by chunk from data streams. It is like a BufferedReader in Java.
But in GHC 8.0.1 some file reading functions do not behave correctly.

import qualified Data.Text.Lazy.IO as LazyText
import qualified Data.Text.Lazy as LazyText

getFileContent1 :: FilePath -> IO String
getFileContent1 fileName = do
  fileContent <- LazyText.readFile fileName
  return $ LazyText.unpack fileContent

-- NOTE: print the file content, reading it chunk by chunk by `fileName`
-- and writing it on `stdout` chunk by chunk.
-- So this simple code, has a nice run-time behaviour.
printFileContent1 fileName = do
  c <- getFileContent1 fileName
  putStrLn c

-- NOTE: this seems a correct function, 
-- but when executed it returns always an empty file content
getFileContent2 :: FilePath -> IO String
getFileContent2 fileName = do
  LazyText.withFile fileName ReadMode $ \handle -> do
    fileContent <- hGetContents handle
    return $ LazyText.unpack fileContent
  
-- NOTE: this code print nothing, due to error on `getFileContent2`
printFileContent2 fileName = do
  c <- getFileContent2 fileName
  putStrLn c

Data.Text.Lazy.IO.readFile is implemented in this way:

readFile :: FilePath -> IO Text
readFile name = openFile name ReadMode >>= hGetContents

Data.Text.Lazy.IO.hGetContents is a function returning the content of the handle chunk by chunk, and closing the handle when all the content is read.
System.IO.withFile is implemented in this way:

 withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r
 withFile name mode = bracket (openFile name mode) hClose

so the getFileContent2 code can be expanded to

 getFileContent3 fileName = do
    bracket (openFile fileName ReadMode) hClose $ \handle -> do
      fileContent <- hGetContents handle
      return $ LazyText.unpack fileContent

bracket is one of a series of resource managements functions and monads used for acquiring resources, and releasing them at the end of an action, in a predictable way, and not when the garbage collector arbitrarily decide it. bracket makes management of scarce resources like file handles, database connections, and so on more robust and predictable.
This code will run correctly

 printFileContent3 fileName = do
    bracket (openFile fileName ReadMode) hClose $ \handle -> do
      fileContent <- hGetContents handle
      putStrLn $ LazyText.unpack fileContent

because it will:

open the file
read it chunk by chunk, using hGetContents
print it chunk by chunk, using putStrLn
close the handle, thanks to bracket resource finalization action

The code printFileContent2 is not running correctly because:

bracket open the file
a lazy evaluation thunk LazyText.unpack <$> hGetContents handle is returned from the getFileContent2 function
bracket close the file handle before the thunk is evaluated

So from this point:

printFileContent2.putStrLn executed the thunk
hGetContents thunk tries to access a closed handle
hGetContents returns an empty content, instead of signaling with a run-time exception that the handle is closed

Then printFileContent2 assumes wrongly that the file is an empty file, without any compile time and run time error.
A test case for the bug is on https://github.com/massimo-zaniboni/ghc_lazy_file_content_error , and the bug was signaled to Ghc team.

RAII Programming in Haskell

Resource Acquisition is Initialization (RAII) is a tecnique for having predictable resource usages.
In Haskell, bracket should be RIIA compliant. This implies that bracket must always return the result in a strict way. In this way when the bracket action is called:

the resources are allocated,
the action is executed with maximum priority, and predictability,
the resources are deallocated,
the result is returned to the caller, completely evaluated, and no further processing involving the resources is required,

So the priority is always given to the called bracket action, and not to the caller action. The caller can be arbitrary complex, and long-running, so the called action had to complete its work with greater priority respect the caller, in order to release its allocated resources in a predictable way. In Haskel terms this signifies that when the caller decides it needs the result of the called code, it evaluates the called thunk completely, without intermediate unevaluated thunks. So called thunk evaluation is an atomic action, and it is not intervaled with the partial execution of the caller code.
This mechanism must be used also in case of nested bracket actions: the called actions must be executed in a strict way.

5 Whys

Why we have the hGetContents error? Because hGetContents is buggy. Why? Because bracket used inside withFile does not behave correctly with unevaluated thunks. Why? Because unevaluated thunks do not play nice with RIIA semantic. Because bracket should force a strict evaluation of the returned action, so the used resources are used completely and in a predictable way.
If bracket forces a strict evaluation of its result, then there will be no bug. This code run correctly

getFileContent4 :: FilePath -> IO String
getFileContent4 fileName = do
  fileContent <- LazyText.readFile fileName
  return $! LazyText.unpack fileContent
  -- NOTE: execute in a strict way, thanks to `$!`

-- NOTE: print the file content, reading it chunk by chunk by `fileName`
-- and writing it on `stdout` chunk by chunk.
-- So this simple code, has a nice run-time behaviour.
printFileContent4 fileName = do
  c <- getFileContent4 fileName
  putStrLn c

Friday, October 14, 2016

Idiomatic Haskell Applicative Code

Not Idiomatic Haskell Code

The many1 function code is the same of the Alternative.many function in the Haskell prelude. The scope of the function is applying zero, one or more times its applicative argument.

{-# LANGUAGE ApplicativeDo #-}

import Control.Applicative

class Alternative f => Alternative1 f where

   -- | Zero or more.
   many1 :: f a -> f [a]
   many1 v = many_v
     where
       many_v = some_v <|> pure []
       some_v = (fmap (:) v) <*> many_v

For me, the some_v inner function is hard to read/comprehend.

Idiomatic Haskell Code

We can rewrite some_v in this way

class Alternative f => Alternative2 f where

   -- | Zero or more.
   many2 :: f a -> f [a]
   many2 v = many_v
     where
       many_v = some_v <|> pure []
       some_v = (:) <$> v <*> many_v

some_v is rewritten using the Applicative idiomatic form:

we evaluate v and many_v using the specific f Applicative instance rules, and inside the applicative running context
we use the two results as arguments of the plain Haskell function (:)
we put the final result of (:) function in the f Functor

The majority of Applicative usage patterns follow this syntactic pattern, or its variation pure f <*> argument1 <*> argument2 <*> ... and after some accustomization, it can be comprehended intuitively without thinking too much to the type definitions, and to the details of involved functional combinators. This is the same for the code like case1 <|> case2, where we recognize immediately that <|> is separating two choices, and we understand the meaning without thinking too much to the underling details.

So we have improved the readability of the first version of the code using the common syntactic form (:) <$> ... <*> ..., instead of the less used fmap (:) <*> ... <*> ... form.

ApplicativeDo

This is a variant of many, but with ApplicativeDo syntax.

class Alternative f => Alternative3 f where

   -- | Zero or more.
   many3 :: f a -> f [a]
   many3 v = some_v <|> pure []

     where

       some_v = do
         v' <- v
         v'' <- (some_v <|> pure [])
         return (v':v'')

The ApplicativeDo notation helps understanding the semantic of some_v because all function arguments are explicit. But the code is a little more verbose respect the idiomatic Applicative pattern usage.

Functional Code

We can rewrite many using explicit types, and adding comments. The resulting code is too much verbose, and not intuitively readable, because there are too much low level details about types and functional combinators, hiding its true meaning.

class Alternative f => Alternative4 f where

   -- | Zero or more.
   many4 :: f a -> f [a]
   many4 v = many_v
     where
       -- TYPE: many_v :: f [a]
       many_v = some_v <|> pure []
       -- if `some_v` fails, return empty list,
       -- terminating `some_v/many_v` mutual recursion.

       -- TYPE: some_v :: f [a]
       some_v 
         = let -- TYPE: fa :: a -> ([a] -> [a])
               fa a = \as -> a:as
               -- `a` is the current result,
               -- to concatenate with next results,
               -- supplied in the [2] part.  

               -- TYPE: fm :: f ([a] -> [a])
               fm = fmap fa v
               -- create something nice to combine using
               -- `<*>` `Applicative` combinator,
               -- having as specific types:
               -- `(<*>) :: f ([a] -> [a]) -> f [a] -> f [a]`  
               -- `fmap :: ([a] -> [a]) -> f [a] -> f [a]

            in fm <*> many_v
               -- [2] part:
               -- concatenate the current result with the next results in `many_v`.
               -- This is a mutual recursion between `many_v` and `some_v`.
               -- At some point `many_v` will fail returning the empty list,
               -- and the recursion loop will stop.

Personal Conclusions

Haskell gives a lot of freedom in the way we can express things, but this freedom can harm code readability. Haskell code using Monads and Applicative instances, should follow few known and accepted idiomatic forms, because otherwise understanding the meaning of complex function combinations is not easy.

All complex function combinations in the end are only function calls, but human mind (at least my mind) does not reason easily with second order functions combined toghether in fancy ways. Instead we are able to recognize common syntactic forms like f <$> arg1 <*> arg2, associating immediately to them a semantic meaning, independently from the more complex low level details.

Thursday, September 29, 2016

Higher Harder Functions

Higher order functions combinators are at the base of Haskell power, but they can make the code harder to comprehend, so some counter measure should be adopted.

Do-Notation vs Applicative-Notation

This post is a literate Haskell document, so we must start with some boilerplate code.

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ApplicativeDo #-}
{-# LANGUAGE Arrows             #-}
{-# LANGUAGE DeriveDataTypeable #-}

module Main where

import Control.Applicative
import qualified Data.Attoparsec.Text.Lazy as A
import Data.Char
import Hakyll

main = do putStrLn "A main is mandatory, but not used in practice."

Attoparsec implements a domain specific language (DSL) for describing parsers. This is an example of function using Attoparsec, and the Applicative idiom.

-- | Parse an hexadecimal number like "%A0", and convert to the
--   the corresponding decimal number 10.
--   This function is written using the Applicative class style.
attoHex :: A.Parser Int
attoHex
  = toInt <$> (A.char '%' *> hexDigit) <*> hexDigit
 where

  hexDigit :: A.Parser Int
  hexDigit = (((-) (ord '0')) . ord) <$> A.choice [A.digit, A.satisfy (A.inClass "A-F")]

  toInt x y = x * 16 + y

What is nice in this code? For sure that there are no low level details about the state of the parser, and the code is only a specification of what we want to parse.

What is not nice in this code? The fact that a novice user must learn some idiomatic use of the Applicative class: toInt <$> ... *> ... <*>. Is this really necessary?

The same function, written using an ApplicativeDo form is:

attoHex' :: A.Parser Int
attoHex' = do
  A.char '%'
  x <- hexDigit
  y <- hexDigit
  return $ x * 16 + y
 where

  hexDigit :: A.Parser Int
  hexDigit = do
    d <- A.choice [A.digit, A.satisfy (A.inClass "A-F")]
    return $ (ord d) - (ord '0')

In this form the arguments of functions are explicit, and the code work-flow is a standard top-down processing. Also the internal function hexDigit is a lot more readable, using explicit arguments, instead of the point-free form.

Higher Order Functions

filter :: (a -> Bool) -> [a] -> [a] is an higher order function accepting another function as argument. Until there are simple functions like filter, it is easy to figure out the final meaning of an expression. When we start combining more complex functions it can become difficult.

Applicative class introduce some non trivial higher order function combinators, generalizing a lot of computation patterns involving parallel computations inside a certain context, with a final composition described by a pure function.

We redefine Maybe, for studying how Applicative instance is supported.

data NMaybe a = NJust a | NNothing

> class  Functor f  where
>   fmap        :: (a -> b) -> f a -> f b

This is the standard instance of Maybe, but defined for NMaybe.

instance  Functor NMaybe  where
  fmap _ NNothing       = NNothing
  fmap f (NJust a)      = NJust (f a)

Recap this:

> class Functor f => Applicative f where
>   pure :: a -> f a
>   (<*>) :: f (a -> b) -> f a -> f b

In this case the instance definition for NMaybe became

instance Applicative NMaybe where
  pure = NJust

pure = NJust is a simple definition, but pure x = NJust x is a little more explicit, because its parameters are explicits.

  NJust f <*> m = fmap f m

This definition is not immediate to understand, because the types are implicit. Types are also a form of documentation, so we try to expand the type of each part of the expression, using an invented notation:

assuming Applicative f
  <*> ::: ... -> NMaybe b
  -- we are applying the <*> function, and its result is `NMaybe b`.

    NJust ::: ... -> NMaybe (a -> b)
    -- this is the first argument of `<*>` parent function,
    -- with its explicit type.

      f ::: a -> b
      -- the argument of `NJust` parent function
      -- This is a "terminal" argument because, it is not an application of a function,
      -- and there are no `...` on its type annotation.

    m ::: NMaybe a
    -- this is the second argument.
    -- Note that the type of the complete type of the parent
    -- is derivable substituting to `...` the resulting types of its arguments,
    -- so in this case `NMaybe (a -> b) -> NMaybe a -> NMaybe b`

  =

  fmap ::: ... -> NMaybe b
    f ::: NMaybe (a -> b)
    m ::: NMaybe a
  -- the type is `NMaybe (a -> b) -> NMaybe a -> NMaybe b`
  -- and it is the same of the left definition.

This form is too much verbose, but it contains all the types, and we can understand the meaning of the expression only studying the types.

Homomorphism Law

This is law must be respcted from every instance of Applicative:

pure g <*> pure x = pure (g x)

The law expressed with explicit types is:

assuming Applicative f
  <*> ::: ... -> f b
    pure ::: ... -> f (a -> b)
      g ::: a -> b
    pure ::: ... -> f a
      x ::: a
  =
  pure ::: ... -> f b
    g ::: ... -> b
    x ::: a

Using a pseudo ApplicativeDo notation

do x' <- pure x
   return $ g x'
=
do g' <- pure $ g x
   return g'

Identity

This law says:

pure id <*> v = v

With explicit types:

assuming Applicative f
  (<*>) ::: ... -> f a
    pure id ::: f (a -> a)
    v ::: f a
  =
  v ::: f a

Using a pseudo ApplicativeDo:

do v' <- v
   return $ id v'
=
v

Composition

This law says:

pure (.) <*> u <*> v <*> w = u <*> (v <*> w)

Disambiguating the associativity (<*> is left associative) the law is

((pure (.) <*> u) <*> v) <*> w = u <*> (v <*> w)

The function composition function (.) is the classical

> (.) :: (b -> c) -> (a -> b) -> (a -> c)
> (.) f g = \a -> f (g a)

The form with explicit types is:

assuming Applicative f
  (<*>) ::: ... -> f c
    (<*>) ::: ... -> f (a -> c)
      (<*>) ::: ... -> f ((a -> b) -> a -> c)
        pure (.) ::: f ((b -> c) -> ((a -> b) -> a -> c))
        u ::: f (b -> c)
      v ::: f (a -> b)
    w ::: f a
  =
  <*> ::: ... -> f c
    u ::: f (b -> c)
    <*> ::: ... -> f b
      v ::: f (a -> b)
      w ::: f a

The pseudo ApplicativeDo is:

do u' :: b -> c
   u' <- u

   v' :: a -> b
   v' <- v

   w' :: a
   w' <- w

   return $ (u' . v') w'

=
do
   vw' :: b
   vw' <- do w' :: a
             w' <- w

             v' :: a -> b
             v' <- v

             return $ v' w'

   u' :: b -> c
   u' <- u

   return $ u' vw'

Lesson Learned

A point-free/applicative expression like

pure (.) <*> u <*> v <*> w = u <*> (v <*> w)

is hard to understand. Showing the intermediate types helps. Expressing it using a do-notation, with explicit names and types for each part of the expression, and with a top-down semantic helps further.

Nested Domain Specific Languages

This is a example of Hakyll code, for generating a static web site.

hakyllDemo :: IO ()
hakyllDemo = hakyllWith defaultConfiguration $ do
  -- Dot images
  match "images/*.dot" $ do
    route   $ setExtension "png"
    compile $ getResourceLBS >>= traverse (unixFilterLBS "dot" ["-Tpng"])

hakyllWith execute a Rules Monad. Rules is a DSL for generating pages, so apparently it is all very neat and elegant.

match, route, and compile returns Rules, so we have apparently only a series of statements of the Rules monad.

Studying the code there is a little surprise: compile accepts as parameter a Monad of type Compiler, and then it returns a value of type Rules. So the part after compile $ is a DSL written not in the Rules DSL, but in the Compiler DSL. But without studying the type of compiler there are no hints of this.

It is the same for route that accepts a description of routes using Routes DSL, and then return a Rules value, that can be embeded in the main hosting Rules Monad.

Because every Monad is a different DSL with a different semantic, the Haskell syntax should indicate better when we are using a certain type of Monad. So the code can be rewritten in a pseudo Haskell notation like this:

hakyllDemo' :: IO ()
hakyllDemo' = hakyllWith defaultConfiguration (:Rules: do
  match "images/*.dot" $ do
    route   (:Route: setExtension "png"))
    compile (::Compile: getResourceLBS >>= traverse (unixFilterLBS "dot" ["-Tpng"]))
)

Code Golfing

Haskell permits many variants for the same expression:

applicativeCall :: NMaybe Int -> NMaybe Int -> NMaybe Int
applicativeCall x y = pure (+) <*> x <*> y

applicativeCall' :: NMaybe Int -> NMaybe Int -> NMaybe Int
applicativeCall' x y = (+) <$> x <*> y

applicativeCall'' :: NMaybe Int -> NMaybe Int -> NMaybe Int
applicativeCall'' x y = liftA2 (+) x y

applicativeCall''' :: NMaybe Int -> NMaybe Int -> NMaybe Int
applicativeCall''' x y = do
  x' <- x
  y' <- y
  return $ x' + y'

In case of very small code fragments, the do notation seems too much verbose respect a liftA2 variant. But in case of real code, with longer functions, maybe a more predictable do notation should be preferred.

Conclusion

Programs written in Java are usually easy to read and comprehend “in the small”, because Java has a simple and coherent semantic. Maybe in the large they use a lot of complex patterns, and the general architecture of the application is not easy to understand, but in the small every fragment of code is readable.

On the contrary Haskell code can be very hard to comprehend in the small, because every Haskell expression can perform a lot of different things, depending from the context and the implicit types, so it must be studied carefully before understanding its real meaning.

Haskell should use a more uniform syntax and semantic, favouring:

more explicit types
do notation with explitic named parts, and clear top-down semantic, instead of clever function compositions
explicit indication of the Monad/Applicative in which the statements are executed

The ideal code should be readable from left to right, from top to bottom, without thinking too much at the low level details of the language semantic.