BatchedQueue does not fit in the model of persistence. Okasaki showed that the data structure gives an amortized O(1) tail, but only in an ephemeral setting, where a sequence of operations has a single logical future. Let's revisit the scenario we left off with in the last post. The main reason the data structure failed under persistence was that by the time we reached the realization stage of logical future #2, we were left with no credits to pay for it. But what if we could ensure that the credits for logical future #2 have already been paid? That is, logical future #2 could reuse the computation that logical future #1 had already done. Enter memoization and call-by-need. Implementing the functional queue in a lazy language that supports call-by-need (like Haskell) lets one logical future reuse the computation that another has already performed.
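As a quick illustration of what call-by-need buys us, here's a minimal sketch (the names expensive and main are mine, not part of the queue code): a suspension is evaluated at most once, and every later consumer reuses the memoized result.

```haskell
import Debug.Trace (trace)

-- A deliberately noisy suspension: 'trace' prints to stderr each
-- time the underlying computation is actually run.
expensive :: [Int]
expensive = trace "reverse runs now" (reverse [1 .. 100000])

main :: IO ()
main = do
  -- Two "logical futures" sharing the same suspension.
  print (head expensive)  -- forces the thunk; the trace fires once
  print (head expensive)  -- memoized result reused; no second trace
```

This sharing of one suspension across several consumers is exactly what the persistent queue will lean on.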
It's called a BankersQueue, and it gives us amortized O(1) in a persistent setting. Here's the snippet that implements the data model:

module BankersQueue (BankersQueue) where

import Prelude hiding (head, tail)
import Queue

data BankersQueue a = BQ Int [a] Int [a]
The basic stuff is the same as the BatchedQueue. In the implementation of the BatchedQueue in a strict language, we would use lists with strict evaluation semantics. The BankersQueue needs lazy lists and call-by-need semantics. In addition to the two lists, the BankersQueue also maintains the lengths of the lists as part of the data structure. This information is used to decide when to reverse the rear and append it to the front.

Lazy amortized functional data structures are not easy to analyze for time complexity. The main difficulty is that operations don't get executed when they seem to. Instead, suspensions (or thunks) get created, only to be forced when their results are actually needed. Okasaki's framework based on debits stipulates that a suspension can be forced only once the debt assigned to it (proportional to the shared cost of the operation) has been paid off. I am not going into the details of the banker's method here; the basic idea is that you can enjoy your pie only once you have completely paid for it. There has to be a linear, sequential ordering between the following three stages in the lifecycle of a suspension.
Create a Suspension (1) => Payoff a Suspension (2) => Force a Suspension (3)
In the implementation of BatchedQueue we were doing the reverse of the rear only when the front became empty. In the banker's method based on debits, doing so does not give us enough time to pay for the entire reverse operation. Remember, the crucial thing is to ensure that the following sequence is honored:

- calling reverse creates a suspension, and we assign it a debt proportional to the shared cost of the operation - in our case, the length of the rear list
- subsequent operations must discharge enough debits to pay off the debt assigned to the suspension for reverse
- only then do we actually execute reverse, since its debt has been paid off

Since the debt on the suspension for reverse is |rear|, we have to discharge an equivalent number of debits before the suspension is forced - in other words, at least |rear| operations must happen first. Every tail discharges one debit off the front, and the reverse gets forced after |front| operations, once the front list is exhausted. Which means that after |front| operations we must be ready to execute reverse, with all of its debt paid off. This is possible if we create the suspension at the point where |rear| just exceeds |front|. The following change to the check function ensures this:

check lenf f lenr r =
if lenr <= lenf then BQ lenf f lenr r
else BQ (lenf + lenr) (f ++ reverse r) 0 []
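To round out the picture, here is a self-contained sketch of the remaining operations, following Okasaki's presentation (the data declaration and check are repeated so the snippet compiles on its own; the Queue type class from the imported module is elided):

```haskell
import Prelude hiding (head, tail)

data BankersQueue a = BQ Int [a] Int [a]

-- Smart constructor: rotates as soon as |rear| exceeds |front|,
-- creating the suspension for reverse early enough to pay it off.
check :: Int -> [a] -> Int -> [a] -> BankersQueue a
check lenf f lenr r =
  if lenr <= lenf then BQ lenf f lenr r
  else BQ (lenf + lenr) (f ++ reverse r) 0 []

empty :: BankersQueue a
empty = BQ 0 [] 0 []

isEmpty :: BankersQueue a -> Bool
isEmpty (BQ lenf _ _ _) = lenf == 0

-- snoc enqueues at the rear; 'check' decides whether to rotate
snoc :: BankersQueue a -> a -> BankersQueue a
snoc (BQ lenf f lenr r) x = check lenf f (lenr + 1) (x : r)

head :: BankersQueue a -> a
head (BQ _ [] _ _)      = error "head: empty queue"
head (BQ _ (x : _) _ _) = x

-- every tail discharges one debit off the front suspension
tail :: BankersQueue a -> BankersQueue a
tail (BQ _ [] _ _)             = error "tail: empty queue"
tail (BQ lenf (_ : f') lenr r) = check (lenf - 1) f' lenr r
```

Note that snoc and tail both funnel through check, so the rotation decision stays in one place.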
Have a look at the following diagram, which shows how reverse gets done in O(1) amortized in a persistent setting. Note how laziness with call-by-need is the secret sauce behind the numbers.
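The same behavior can be observed directly. The following is a pared-down sketch (the abbreviated names BQ, hd, tl and the planted trace are mine) in which two logical futures branch off the same queue, yet each suspended reverse still runs at most once:

```haskell
import Debug.Trace (trace)

-- A pared-down BankersQueue with a trace planted on the suspended
-- reverse, so we can watch on stderr how often it actually runs.
data BQ a = BQ Int [a] Int [a]

check :: Int -> [a] -> Int -> [a] -> BQ a
check lenf f lenr r
  | lenr <= lenf = BQ lenf f lenr r
  | otherwise    = BQ (lenf + lenr)
                      (f ++ trace "reverse runs" (reverse r)) 0 []

snoc :: BQ a -> a -> BQ a
snoc (BQ lenf f lenr r) x = check lenf f (lenr + 1) (x : r)

hd :: BQ a -> a
hd (BQ _ (x : _) _ _) = x
hd _                  = error "empty"

tl :: BQ a -> BQ a
tl (BQ lenf (_ : f') lenr r) = check (lenf - 1) f' lenr r
tl _                         = error "empty"

main :: IO ()
main = do
  let q  = foldl snoc (BQ 0 [] 0 []) [1 .. 6 :: Int]
      q1 = tl q   -- logical future #1
      q2 = tl q   -- logical future #2
  -- Both futures force the same shared suspensions inside q's
  -- front list; thanks to call-by-need, each runs at most once.
  print (hd q1, hd q2)  -- prints (2,2)
```

Running this, each "reverse runs" message appears once per distinct suspension, even though two futures consume the queue.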
The BankersQueue can also be implemented in languages that offer optional laziness. Scala can implement it with Stream, a lazy sequence, and lazy val, which offers call-by-need semantics. An interesting benchmark would be against the corresponding implementation in Haskell; it remains to be seen what impact the overhead of creating and managing suspensions has on the actual running time.