Some Random Thoughts of an Advanced Haskeller


Recently I was thinking about a programming problem that would need access to random values. I thought it might be fun to write up my though process as an advanced Haskeller while working through this particular problem.

In Haskell, one would write such a program by using a random monad to access an oracle providing random numbers. The traditional way to implement MonadRandom is using the state monad. The Gen type holds the state of the (pseudo-)random number generator, and the randomInt function returns a new random number and updates the state of the generator.

type Random a = State Gen a

randomInt :: Gen -> (Int,Gen)

randomOracle :: MonadRandom Int
randomOracle = state random

Then I write my program inside the Random monad, making calls to the randomOracle as needed

myProg :: Random Result
myProg = do
  {- ... -}
  x <- randomOracle
  {- ... -}
  y <- randomOracle
  {- ... -}

In order to run my program, I need to provide it with a random seed.

evalRandom :: Random result -> Gen -> result
evalRandom = evalState

For deterministic testing, we can pass fixed generators to evalRandom. If we use StdGen, we can map our Random program to IO and use the system random number generator.

type Gen = System.Random.StdGen
randomInt = System.Random.random

evalRandomIO :: Random result -> IO result
evalRandomIO = getStdRandom . runState

For the most general possible random number generator, the type for the generator state is simply an infinite stream of random values.

data Stream a = Cons a (Stream a)

unfoldStream :: (g -> (a, g)) -> g -> Stream a
unfoldStream next = go
  go seed = Cons value (go nextSeed)
    (value, nextSeed) = next seed

type Gen = Stream Int
randomInt (Cons hd tl) = (hd, tl)

evalRandomStdGen :: Random result -> StdGen -> result
evalRandomStdGen rr = evalRandom rr . unfoldStream System.Random.random

evalRandomIO :: Random result -> IO result
evalRandomIO rr = evalRandomStdGen rr <$> newStdGen

Before, when I was an intermediate Haskeller, I would probably stop at this point pretty satisfied with this. And let me be clear that this is a fine solution. However, now that I am an advanced Haskeller, I cannot help but feel a little dissatisfied with this solution. The problem with this implementation of the Random monad is that the type is too broad. Since the Random type is the State monad, there are operations allowed by the type that should not be allowed for a Random program. For instance, within the Random type, someone could store the state of the generator and restore it later causing random values to be replayed, or someone might completely replace the state of the generator with their own value.

One way to solve this problem is to use Haskell’s module system to abstract the monad and only expose the randomOracle operation. While this is a reasonable solution (in fact it is a very good solution as we will see), it would be nicer if we could instead use the type system to create a monad that is only capable of representing the programs we want to allow, and disallows other programs that would try manipulate the generator state in ways we do not want. Essentially we want our Random programs to only be able to query the random oracle, and that is it. After reflecting on this problem and the various kinds of monads I know about, I realized that the a suitable free monad captures exactly this notion of providing only an operation to query the random oracle. Specifically we want Control.Monad.Free.Free (Reader Int) or more directly (but more obtusely written) as Control.Monad.Free.Free ((->) Int). We truly want a free monad because any sequence of responses from the random oracle is valid.

One problem with this Free monad is that the bind operation can be slow because it needs to traverse through a, possibly long, data structure. There are several solutions to this, but for this particular free monad, I happen to know that the Van Laarhoven free monad representation is possible: The type forall m. Monad m => m Int -> m a is isomorphic to Control.Monad.Free.Free ((->) Int) a.

newtype Random a = Random { instantiateRandom :: forall m. Monad m => m Int -> m a }

instance Monad Random where
  return a = Random $ \_ -> return a
  ma >>= mf = Random . runReaderT
            $ ReaderT (instantiateRandom ma) >>= ReaderT . instantiateRandom . mf
instance Applicative Random where
  pure = return
  (<*>) = ap

instance Functor Random where
  fmap = liftM

In this representation, the random oracle function just fetches the argument.

randomOracle :: Random Int
randomOracle = Random id

For deterministic testing purposes we can evaluate the random monad by instantiating it with our state monad from before.

evalRandom :: Random result -> Stream Int -> result
evalRandom rr = evalState . instantiateRandom rr . state $ \(Cons hd tl) -> (hd, tl)

However, we can also directly instantiate it with the IO monad for production purposes.

evalRandomIO :: Random result -> IO result
evalRandomIO rr = instantiateRandom rr evalRandomIO

This is all very good; however, the advanced Haskeller in me still thinks that I ought to be able to write evalRandom without the need to invoke the State monad. This is because if we were actually using the Free ((->) Int monad, I would be writing evalRandom using iterA.

iterA :: (Applicative p, Functor f) => (f (p a) -> p a) -> Free f a -> p a 

evalFreeRandom :: Free ((->) Int) result -> Stream Int -> result
evalFreeRandom = iterA (\fpa (Cons hd tl) -> fpa hd tl)

No need for any state monad business in evalFreeRandom. How can I get that elegant solution with the Van Laarhoven free monad? One way would be to convert to the Free ((->) Int) representation

freeRandom :: Random result -> Free ((->) Int) result
freeRandom rr = instantiateRandom rr (liftF id)

evalRandom :: Random result -> Stream Int -> result
evalRandom = evalFreeRandom . freeRandom

Surely there must be a way to do this directly?

Before solving this I turned to another interesting problem. The iterA function recurses over the free monad structure to do its evaluation. The Free monad comes with its own general recursive elimination function called foldFree

foldFree :: Monad m => (forall x. f x -> m x) -> Free f a -> m a

This foldFree function is captures everything about the free monad, so it should be possible to write iterA by using foldFree to do all the recursion. But how is that even possible? The types of foldFree and iterA look too far apart. foldFree requires an natural transformation as input, and the arguments to iterA provide nothing resembling that.

To solve this I turned to the #haskell IRC channel for help. With assistance from glguy, ReinH, and byorgey, I turned the well known idea that if you want turn something to or from a natural transformation you use some sort of Yoneda / continuation like construction. In this particular case, the Cont (p a) monad seems to capture what we need. Following the types (and forgetting about the semantics) we end up the following.

iterA :: (Applicative p, Functor f) => (f (p a) -> p a) -> Free f a -> p a 
iterA h ff = runCont (foldFree (\fx -> Cont $ \k -> h (k <$> fx)) ff) pure

As an aside, glguy has a more “natural” solution, but it technically has a less general type, so I will not talk about here, even if I do feel it is better.

Turning back to our original problem of directly writing evalRandom without using the state monad, we can try to see if Cont will solve our problem.

evalRandom :: Random result -> Stream Int -> result
evalRandom rr = runCont (instantiateRandom rr (Cont $ \k (Cons hd tl) -> k hd tl)) const

We can compare the Cont solution to the State solution and see that the code is pretty similar.

evalRandom :: Random result -> Stream Int -> result
evalRandom rr = evalState (instantiateRandom rr (state $ \(Cons hd tl) -> (hd, tl)))

The inner Cont construction looks very similar to the inner state construction. The outer call to const in the Cont solution discards the final "state" which captures the same functionality that evalState has for the State solution. Now we can ask, which has better performance, the State solution, with its tupling and untupling of values, or the Cont solution which uses continuations instead? I will leave it to the GHC experts to figure that one out.

Arguably most of this is an exercise in academics, but it only took me a hour or three to go through this whole thought process. As an advanced Haskeller, I have slowly gathered, over many years, experience with these sorts of abstractions so that it starts becoming easy to do this sort of reasoning. While it may or may not matter much for my particular application, eventually this kind of reasoning becomes important. For example, the modern stream fusion in GHC exploits constructions that resemble this kind of reasoning, and that has had a big impact on performance.

For the non-advanced Haskellers out there, do not be deterred. Keep practicing your craft; keep reading about new abstractions, even if you do not fully get it; keep looking out for potential applications of those abstractions to solidify your understanding. Eventually you will have lots of very powerful problem solving tools at your disposal for making safe software.



Russell O’Connor: contact me