Data.Text.Lazy is a nice data type: it allows simple text-handling code together with efficient run-time text processing, because the text is loaded chunk by chunk from the data stream. It is similar to a BufferedReader in Java. But in GHC 8.0.1 some file reading functions do not behave correctly.
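import Control.Exception (bracket)
import System.IO (IOMode (ReadMode), hClose, openFile, withFile)
import qualified Data.Text.Lazy.IO as LazyText
import qualified Data.Text.Lazy as LazyText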
getFileContent1 :: FilePath -> IO String
getFileContent1 fileName = do
  fileContent <- LazyText.readFile fileName
  return $ LazyText.unpack fileContent
-- NOTE: prints the file content, reading `fileName` chunk by chunk
-- and writing it to `stdout` chunk by chunk.
-- So this simple code has a nice run-time behaviour.
printFileContent1 fileName = do
  c <- getFileContent1 fileName
  putStrLn c
-- NOTE: this looks like a correct function,
-- but when executed it always returns an empty file content.
-- `withFile` and `ReadMode` come from System.IO.
getFileContent2 :: FilePath -> IO String
getFileContent2 fileName = do
  withFile fileName ReadMode $ \handle -> do
    fileContent <- LazyText.hGetContents handle
    return $ LazyText.unpack fileContent
  
-- NOTE: this code prints nothing, due to the error in `getFileContent2`
printFileContent2 fileName = do
  c <- getFileContent2 fileName
  putStrLn c

Data.Text.Lazy.IO.readFile is implemented in this way:

readFile :: FilePath -> IO Text
readFile name = openFile name ReadMode >>= hGetContents

Data.Text.Lazy.IO.hGetContents is a function returning the content of the handle chunk by chunk, and closing the handle when all the content is read.

System.IO.withFile is implemented in this way:

withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r
withFile name mode = bracket (openFile name mode) hClose

getFileContent2 code can be expanded to:

getFileContent3 fileName = do
    bracket (openFile fileName ReadMode) hClose $ \handle -> do
      fileContent <- LazyText.hGetContents handle
      return $ LazyText.unpack fileContent

bracket is one of a series of resource management functions and monads used for acquiring resources and releasing them at the end of an action in a predictable way, and not whenever the garbage collector arbitrarily decides to. bracket makes the management of scarce resources like file handles, database connections, and so on more robust and predictable.

This code will run correctly:

printFileContent3 fileName = do
    bracket (openFile fileName ReadMode) hClose $ \handle -> do
      fileContent <- LazyText.hGetContents handle
      putStrLn $ LazyText.unpack fileContent

It executes these steps:
- open the file
- read it chunk by chunk, using hGetContents
- print it chunk by chunk, using putStrLn
- close the handle, thanks to the bracket resource finalization action
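
The same pattern works for any consumer that does all of its work while the handle is still open. The following is a minimal sketch, not code from the test repository: forEachChunk and countChars are hypothetical names introduced here, and the sketch assumes the text package is available. It visits the strict chunks of the lazy text inside withFile, so the whole traversal finishes before the handle is closed.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import System.IO (IOMode (ReadMode), withFile)
import qualified Data.Text as StrictText
import qualified Data.Text.Lazy as LazyText
import qualified Data.Text.Lazy.IO as LazyText

-- | Hypothetical helper: run an action on every chunk of the file,
-- while the handle opened by `withFile` is still open.
forEachChunk :: FilePath -> (StrictText.Text -> IO ()) -> IO ()
forEachChunk fileName action =
  withFile fileName ReadMode $ \handle -> do
    content <- LazyText.hGetContents handle
    -- `mapM_` demands the chunks one at a time, inside the bracket.
    mapM_ action (LazyText.toChunks content)

-- | Hypothetical usage: count the characters of a file chunk by chunk.
countChars :: FilePath -> IO Int
countChars fileName = do
  counter <- newIORef (0 :: Int)
  forEachChunk fileName $ \chunk ->
    modifyIORef' counter (+ StrictText.length chunk)
  readIORef counter
```

Because forEachChunk returns (), there is no lazy result that could be evaluated after the handle is closed.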

printFileContent2 is not running correctly because:
- bracket opens the file
- a lazy evaluation thunk LazyText.unpack <$> hGetContents handle is returned from the getFileContent2 function
- bracket closes the file handle before the thunk is evaluated
- putStrLn in printFileContent2 evaluates the thunk
- the hGetContents thunk tries to access a closed handle
- hGetContents returns an empty content, instead of signaling with a run-time exception that the handle is closed

As a result, printFileContent2 wrongly assumes that the file is empty, without any compile-time or run-time error.

A test case for the bug is at https://github.com/massimo-zaniboni/ghc_lazy_file_content_error and the bug was reported to the GHC team.
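
The difference between the two evaluation orders can be sketched with a smaller example. This is a hedged illustration and not the code of the linked test case: lengthAfterClose and lengthBeforeClose are hypothetical names. The first function lets the lazy content escape withFile, so according to the behaviour described above it is expected to see an empty content. The second one forces the length with evaluate while the handle is still open.

```haskell
import Control.Exception (evaluate)
import Data.Int (Int64)
import System.IO (IOMode (ReadMode), withFile)
import qualified Data.Text.Lazy as LazyText
import qualified Data.Text.Lazy.IO as LazyText

-- | Hypothetical example of the problematic order: the lazy content is
-- returned from `withFile`, and it is demanded only after the handle
-- has been closed.
lengthAfterClose :: FilePath -> IO Int64
lengthAfterClose fileName = do
  content <- withFile fileName ReadMode LazyText.hGetContents
  return (LazyText.length content)

-- | Hypothetical example of the safe order: `evaluate` forces the length,
-- and so the reading of all the chunks, before `withFile` closes the handle.
lengthBeforeClose :: FilePath -> IO Int64
lengthBeforeClose fileName =
  withFile fileName ReadMode $ \handle -> do
    content <- LazyText.hGetContents handle
    evaluate (LazyText.length content)
```

Both functions type check in the same way, which is why the compiler cannot help in spotting the first one.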
RAII Programming in Haskell
Resource Acquisition Is Initialization (RAII) is a technique for making resource usage predictable.

In Haskell, bracket should be RAII compliant. This implies that bracket must always return its result in a strict way. In this way, when the bracket action is called:
- the resources are allocated,
- the action is executed with maximum priority and predictability,
- the resources are deallocated,
- the result is returned to the caller completely evaluated, and no further processing involving the resources is required.

This mechanism must be used also in case of nested bracket actions: the called actions must be executed in a strict way.
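
A bracket with the behaviour described above can be approximated in user code. The following is only a sketch under stated assumptions: strictBracket and getFileContentStrict are hypothetical names, not functions of base, and the sketch assumes the deepseq package for NFData and force. It fully evaluates the result of the action before the release action runs.

```haskell
import Control.DeepSeq (NFData, force)
import Control.Exception (bracket, evaluate)
import System.IO (IOMode (ReadMode), hClose, openFile)
import qualified Data.Text.Lazy as LazyText
import qualified Data.Text.Lazy.IO as LazyText

-- | Hypothetical helper: like `bracket`, but the result of the action is
-- evaluated to normal form while the resource is still acquired.
strictBracket :: NFData r => IO a -> (a -> IO b) -> (a -> IO r) -> IO r
strictBracket acquire release action =
  bracket acquire release $ \resource -> do
    result <- action resource
    -- `force` builds the fully evaluated value, `evaluate` demands it now.
    evaluate (force result)

-- | Hypothetical variant of `getFileContent2` built on `strictBracket`.
getFileContentStrict :: FilePath -> IO String
getFileContentStrict fileName =
  strictBracket (openFile fileName ReadMode) hClose $ \handle ->
    LazyText.unpack <$> LazyText.hGetContents handle
```

The price of this design is that the whole result is held in memory at once, so the chunk-by-chunk behaviour of the lazy text is lost.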
5 Whys
Why do we have the hGetContents error? Because hGetContents is buggy. Why? Because bracket, used inside withFile, does not behave correctly with unevaluated thunks. Why? Because unevaluated thunks do not play nicely with RAII semantics: bracket should force a strict evaluation of the returned action, so that the acquired resources are used completely and in a predictable way.

If bracket forced a strict evaluation of its result, there would be no bug. This code runs correctly:

getFileContent4 :: FilePath -> IO String
getFileContent4 fileName = do
  fileContent <- LazyText.readFile fileName
  return $! LazyText.unpack fileContent
  -- NOTE: `$!` forces the result before returning it, but only to weak
  -- head normal form (the first character), not the whole string
-- NOTE: prints the file content, reading `fileName` chunk by chunk
-- and writing it to `stdout` chunk by chunk.
-- So this simple code has a nice run-time behaviour.
printFileContent4 fileName = do
  c <- getFileContent4 fileName
  putStrLn c