I was browsing the FIPS 180-2 definition of SHA, and one of the examples they give is the hash of 1,000,000 'a's. I thought this would be a great example of laziness in Haskell. (replicate 1000000 'a') acts as a lazy producer of 'a's, and sha1 from the crypto library acts as a consumer. sha1 (replicate 1000000 'a') should run beautifully in constant space. But that isn’t what happened.
Prelude Data.Digest.SHA1Aux> sha1 (replicate 1000000 'a')
"*** Exception: stack overflow
Oh, what could have gone wrong? The problem turns out to be with the preprocessing of the input string. SHA requires the length of the input string to be appended to the input. To simplify a little, preprocessing is defined to be something like
preprocessing :: [Char] -> [Char]
preprocessing x = x ++ show (length x)
Of course, this isn’t how it is implemented. In the definition above, as the input gets consumed, the expansion of (replicate 1000000 'a') is kept around because it is needed later to evaluate (length x). Keeping this million-element list around is no good.
Instead the code looks something like
preprocessing x = helper x 0
  where helper (x:xs) n = x : helper xs (n+1)
        helper []     n = show n
Unfortunately this doesn’t work either. What happens is that as the processed list is consumed, an unevaluated expression for n is built up, becoming 0+1+1+1+…. This expression has a million nodes before it is evaluated. The stack blows up when it is finally evaluated (not to mention all the memory consumed before that).
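This is the same accumulator problem that shows up with lazy left folds. As an aside (this example is mine, not from the SHA code), foldl versus foldl' shows the difference between a lazy and a strict accumulator:

import Data.List (foldl')

-- Lazy left fold: the accumulator becomes the thunk ((0+1)+1)+…,
-- just like the counter n above, and may blow the stack when forced.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Strict left fold: foldl' forces the accumulator at every step,
-- so it runs in constant space.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0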
So that is no good either. What is the solution? Apparently a little strictness is in order.
preprocessing x = helper x 0
  where helper (x:xs) n = x : (helper xs $! (n+1))
        helper []     n = show n
Here the $! says to evaluate (n+1) right away. This stops the million-node thunk from building up.
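To see the difference, here is one way to exercise the two versions (this little driver is my own, not part of the post or the library):

-- Forcing the last character makes helper walk the whole input, so the
-- counter is bumped a million times. With $! this runs in constant
-- space; with the lazy helper it builds the million-node thunk instead.
main :: IO ()
main = print (last (preprocessing (replicate 1000000 'a')))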
I find this a bit strange; to get the nice constant-space results of lazy evaluation that I expect, I need to make the code more strict. Of course, too much strictness is no good. In a language like ML, (replicate 1000000 'a') would be evaluated right away and take up a million nodes. Laziness would have to be emulated with coroutines, callCC, or something like that.
This algorithm seems to really require half-lazy, half-strict computation.