Some of you may have dismissed my last post, because both the strict and lazy vector normalization routines were O(N). According to the paper entitled “More Haste, Less Speed” there exists a problem that any strict, functional, (on-line) algorithm requires Ω(M log N) time, while there exists a lazy, functional, (on-line) algorithm that runs in O(M) time.
The problem, posed by Pippenger, is to take a stream of characters (or anything) of length M and group the characters into blocks of size N. Then apply a given permutation to each block. Finally output the resulting stream of characters. The lazy solution is described in the paper.
Crucial to the fast lazy program is the observation that instead of repeatedly applying a permutation to groups of n symbols, the same result can be obtained by one application of the permutation to a group of n sequences of symbols. The sequence of lists which are to be permuted is transposed, the transposed list of sequences is permuted once, and the permuted list of sequences is transposed back again [emphasis mine].
As usual, the lazy solution boggles the mind. It would seem that thunks set up wiring so that the incoming stream is accessed in such a way that the permutation is applied.
Perhaps it is best to not try and understand how lazy programs work and just be satisfied that it does.