Recently some discussion on the Haskell IRC channel suggested that using continuation passing style can increase the performance of monadic code. I decided to try this with my 2007 ICFP contest code. The results were stunning.
I had previously updated my monad to make it more precise, and to avoid using the unoptimized Ubuntu MTL package.
The previous monad data type constructor was (approximately)
data DNAState a = Done
                | More DNA a
newtype DNAMonad a = DNAMonad
 { xrunDNA :: DNA -> ([RNA]->[RNA], DNAState a)}
which would be StateT DNA (MaybeT (Writer (Endo [RNA]))) if using the MTL.
I created a new monad that does exactly the same thing but using CPS.
newtype DNAMonad a = DNAMonad 
 { unDNA :: forall r. (Cont (DNA -> ([RNA], DNAState r)) a)}
This would be forall r. ContT r (StateT DNA (MaybeT (Writer [RNA]))) with the MTL.
I also copied code from the Control.Monad.Cont module so that the operations on Cont are inlined.
The previous codecode without using CPS took about 8 minutes to run Endo’s DNA.  The new CPS monad takes about 2 minutes!
These timings are not  comparable to my previous article because I’m nowI switched to using the Data.Sequence for the DNA representation between now and then, and I’m running it on my laptop.  None the less, a four fold improvement probably puts me on par with the lisp implementation.
My only problem is that I don’t really understand how it manages to run so much faster. The implementation of bind is notably different. The old code had to deconstruct and reconstruct tuples.
  a >>= f = DNAMonad $
    \dna -> let (r1,s1) = xrunDNA a dna
             in case s1 of
                  Done -> (r1, Done)
                  More	 dna1 a1 ->
                    let (r2,s2) = xrunDNA (f a1) dna1
                     in (r1 . r2, s2)
The new code just calls Cont’s bind and doesn’t touch any tuples.
a >>= f = DNAMonad (unDNA a >>= (\x -> unDNA (f x)))
I guess the result value a has been pulled from inside the structure to outside the structure.  Still, I have no intuition about the run time effects of using CPS.  What good is my degree in computer science?
It is worth noting that I’ve replace the difference list [RNA] -> [RNA] with simply [RNA] because I believe the CPS style automatically gives me the right associativity of list concatenation that I need.
Thanks goes to ddarius, ski and Saizan.