An asymptotically optimal Bernoulli factory for certain functions that can be expressed as power series

Given a sequence of independent Bernoulli variables with unknown parameter $p$, and a function $f$ expressed as a power series with non-negative coefficients that sum to at most $1$, an algorithm is presented that produces a Bernoulli variable with parameter $f(p)$. In particular, the algorithm can simulate $f(p)=p^a$, $a\in(0,1)$. For functions with a derivative growing at least as $f(p)/p$ for $p\rightarrow 0$, the average number of inputs required by the algorithm is asymptotically optimal among all simulations that are fast in the sense of Nacu and Peres. A non-randomized version of the algorithm is also given. Some extensions are discussed.


Introduction
Let $X = (X_i)_{i\in\mathbb{N}}$ denote a sequence of independent, identically distributed (i.i.d.) Bernoulli random variables $X_i$ with unknown parameter $p$. A non-randomized stopping rule on $X$ is a sequence of stopping functions $\tau_1(x_1), \tau_2(x_1, x_2), \ldots, \tau_i(x_1, \ldots, x_i), \ldots$ with $\tau_i : \{0,1\}^i \to \{0,1\}$. Given a realization $x_1, x_2, \ldots$ of $X$, the stopping time $N$ is the smallest integer $i$ for which $\tau_i(x_1, \ldots, x_i)$ equals $1$, or infinity if this does not occur. $N$ is assumed to be finite almost surely; that is, for almost all realizations of $X$ at least one of the stopping functions takes the value $1$.
A randomized stopping rule uses, in addition to $X$, a randomizing sequence $U = (U_i)_{i\in\mathbb{N}}$ of independent random variables with uniform distribution on $(0,1)$. The stopping function $\tau_i$ depends on $x_1, u_1; x_2, u_2; \ldots; x_i, u_i$, where each $u_j$ represents the value taken by the random variable $U_j$, and is assumed to be measurable. The stopping time $N$ is defined as before, and is again assumed to be finite almost surely.
Another possible definition of a randomized stopping rule would be to specify that at each $i$, given $X_1 = x_1, \ldots, X_i = x_i$, there is a certain probability of stopping that depends on $x_1, \ldots, x_i$. This corresponds to the above definition if the output of $\tau_i$ is obtained by comparing $U_i$ with a threshold that depends on $x_1, \ldots, x_i$. Thus the definition based on the auxiliary sequence $U$ captures all the randomness that can be effected by a randomized stopping rule (and is more convenient for the purposes of this paper).
Let $f : S \to (0,1)$ denote a function defined on a set $S \subseteq (0,1)$, and let $X$ be a sequence of independent Bernoulli variables with parameter $p \in S$. A non-randomized Bernoulli factory of $f$ based on $X$ is an algorithm that, using values from the sequence $X$ as inputs, generates a Bernoulli variable $Y$ with parameter $f(p)$. Specifically, the number $N$ of required inputs is a stopping time on $X$ dictated by a non-randomized stopping rule, and the output value $Y$ depends on $X_1, \ldots, X_N$.
A randomized Bernoulli factory uses, in addition to the sequence $X$, an auxiliary sequence $U$ of independent random variables uniformly distributed on $(0,1)$. Specifically, $N$ is given by a randomized stopping rule on $X$ with $U$ as randomizing sequence. The output $Y$ is also possibly randomized; that is, it depends on $X_1, \ldots, X_N$ and $U_1, \ldots, U_N$. Specifically, there exists a sequence of functions $\gamma_1, \gamma_2, \ldots$ such that for $N = n$ and for $X_1 = x_1, U_1 = u_1; \ldots; X_n = x_n, U_n = u_n$ the output is given as $\gamma_n(x_1, u_1; \ldots; x_n, u_n)$. These functions are assumed to be measurable.
The use of the term "randomized" in the above definitions of randomized stopping rules or Bernoulli factories refers to the fact that the stopping time N and the factory output Y are random even if conditioned on the input sequence X, due to the additional source of randomness represented by U. In the following, a Bernoulli factory will also be referred to as a simulation.
One of the earliest references to a Bernoulli factory, for the specific case $f(p) = 1/2$, appears in a work by von Neumann [1]. For general $f$, Keane and O'Brien [2] give a necessary and sufficient condition for a simulation of $f$ to be possible, namely that the function either is constant, or is continuous and satisfies a certain polynomial bound. Nacu and Peres [3] define a non-randomized simulation to be fast if the distribution of $N$ has an exponential tail; that is, for any $p \in S$ there exist values $A > 0$, $\beta < 1$ (which may depend on $p$) such that

$$\Pr[N > n] \leq A \beta^n. \qquad (1)$$
The authors prove that a fast simulation exists for any $f$ real analytic on any closed interval contained in $(0,1)$. In this paper, the definition will be extended to randomized simulations, which will be considered fast if they satisfy (1). Several works on Bernoulli factories present simulation algorithms for specific functions. Considerable attention has been given to $f(p) = \min\{2p, 1-2\varepsilon\}$, $\varepsilon > 0$, which is an important building block for simulating other functions [3,4]; and to linear functions $f(p) = cp$, $c > 1$, defined on a suitable set $S \subset (0, 1/c)$; see the work by Huber [5,6].
A crucial parameter of a Bernoulli factory is $E[N]$, that is, how many inputs $X_i$ are required on average to generate a sample of $Y$. In applications, observing the variables $X_i$ is usually costly, and thus $E[N]$ should be as small as possible. For randomized Bernoulli factories, the auxiliary random variables $U_i$ are assumed to be cost-free.
This paper deals mainly with functions of the form $f(p) = 1 - \sum_{k=1}^{\infty} c_k (1-p)^k$, where the coefficients $c_k$ are non-negative and sum to $1$. A randomized algorithm to simulate any such function is presented in Section 2. The algorithm is shown to be fast in the sense of (1), and its average number of inputs, $E[N]$, is computed. The algorithm can be particularized to functions $f(p) = p^a$, $a \in (0,1)$. For $a = 1/2$ the algorithm is similar to that given by Wästlund [7]; and the presented results affirmatively answer question 1 from [3], i.e. establish that $\sqrt{p}$ can be simulated with finite $E[N]$.
An interesting subclass of functions is formed by those that, in addition to having a power series expression as above, satisfy the following two conditions: $f(p)/p \to \infty$ as $p \to 0$, and the derivative $f'(p)$ asymptotically grows at least as fast as $f(p)/p$. In particular, this includes all functions that behave asymptotically like $b p^a$, $a \in (0,1)$, $b \in (0,\infty)$. For these functions, it will be seen that any fast simulation algorithm has an average number of inputs $E[N]$ that grows without bound as $p \to 0$. It is therefore important to analyse the asymptotic rate of growth of $E[N]$. This analysis is presented in Section 3. The results show that the proposed algorithm is asymptotically optimal for the mentioned subclass of functions, in the sense that for any other fast algorithm $E[N]$ grows at least as fast as $p \to 0$.
A non-randomized version of the proposed algorithm is given in Section 4, and is also shown to be fast. Section 5 discusses some extensions of the algorithms to cover a broader range of functions. Section 6 presents conclusions and open problems. Section 7 contains proofs to all results.
The following notation is used throughout the paper. $x^{(m)}$ represents the rising factorial $x^{(m)} = x(x+1)\cdots(x+m-1)$.

Simulation algorithm. Average number of inputs
Consider an i.i.d. sequence $X$ of random variables $X_i$ that take the value $1$ with probability $p$ and $0$ with probability $1-p$. Let $f : (0,1) \to (0,1)$ be a function that can be expressed as a power series

$$f(p) = 1 - \sum_{k=1}^{\infty} c_k (1-p)^k, \qquad (2)$$

where the coefficients satisfy

$$c_k \geq 0 \text{ for all } k \in \mathbb{N}, \qquad (3)$$

$$\sum_{k=1}^{\infty} c_k = 1. \qquad (4)$$

Note that this implies that $f$ is differentiable, and $\lim_{p\to 0} f(p) = 0$, $\lim_{p\to 1} f(p) = 1$.
The randomized algorithm to be presented yields a Bernoulli random variable $Y$ with parameter $f(p)$. It takes as inputs a number of random variables from the sequence $X$, as well as from an auxiliary sequence $U$ of i.i.d. random variables $U_i$ uniformly distributed on $(0,1)$ and independent from the $X_i$ variables. The algorithm makes use of coefficients $d_k$ computed from $c_k$ as follows:

$$d_k = \frac{c_k}{1 - \sum_{j=1}^{k-1} c_j}, \qquad (5)$$

defined whenever the denominator is non-zero. From (3) and (4) it stems that $0 \leq d_k \leq 1$. If the number of non-zero coefficients $c_k$ is finite, i.e. if there exists $K$ such that $c_K > 0$ and $c_k = 0$ for $k > K$, (5) gives $d_K = 1$; and for $k > K$ the coefficient $d_k$ is not defined (and is not necessary, as will be seen).
Algorithm 1. Let $f$ be given by (2)-(4), and let $d_k$ be defined by (5). The input of the algorithm is a sequence $X$ of i.i.d. Bernoulli random variables. The output is a Bernoulli random variable $Y$.

1. Set $i = 1$.
2. Take one input $X_i$.
3. Using $U_i$, generate a Bernoulli random variable $V_i$ with parameter $d_i$ (that is, $V_i = 1$ if $U_i \leq d_i$, else $V_i = 0$).
4. If $V_i$ or $X_i$ are $1$, output $Y = X_i$ and finish. Else increase $i$ and go back to step 2.

Theorem 1. The output $Y$ of Algorithm 1 is a Bernoulli random variable with parameter $f(p)$.
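The loop just described can be sketched in code as follows. This is a minimal sketch, not the paper's reference implementation; `sample_x` stands for the source of Bernoulli($p$) inputs and `d` for a function returning the coefficients $d_i$, both hypothetical names.

```python
import random

def algorithm1(sample_x, d):
    """One run of Algorithm 1 (sketch).

    sample_x: callable returning one Bernoulli(p) input X_i.
    d: callable giving the coefficient d_i (1-indexed).
    Returns (Y, N): the output bit and the number of inputs consumed.
    """
    i = 1
    while True:
        x = sample_x()                            # step 2: take input X_i
        v = 1 if random.random() <= d(i) else 0   # step 3: V_i ~ Bernoulli(d_i), from U_i
        if v == 1 or x == 1:                      # step 4: stop when V_i or X_i is 1
            return x, i                           # output Y = X_i
        i += 1                                    # otherwise continue with the next input
```

Note that the algorithm never needs $d_k$ beyond the first index where $d_K = 1$, since it is guaranteed to stop there.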
The idea of this algorithm is similar to that presented by Wästlund [7] to simulate $f(p) = \sqrt{p}$, namely, decompose the event $Y = 0$ as an infinite sum of mutually exclusive events, each with probability $c_k (1-p)^k$. However, there are two differences here. First, the referenced paper treats the $c_k$ and $(1-p)^k$ parts separately. Namely, an auxiliary random variable $L$ is first generated with $\Pr[L = k] = c_k$, $k \in \mathbb{N}$. This variable represents the number of inputs $X_i$ that need to be taken. Then those inputs are observed and the output is $Y = 1 - \prod_{i=1}^{L} (1 - X_i)$, so that $Y = 0$ exactly when $X_1 = \cdots = X_L = 0$. On the other hand, Algorithm 1 reduces the number of required inputs $X_i$ by stopping as soon as one of the $X_i$ variables is $1$. This can be done because in that case the above product is $0$ irrespective of the values of the remaining $X_i$ variables.
The second difference is that Algorithm 1 uses auxiliary Bernoulli variables $V_i$ with respective parameters $d_i$, instead of a random variable $L$ with the distribution given by the coefficients $c_k$. The latter approach, used in [7], was probably motivated by the fact that for $f(p) = \sqrt{p}$ the coefficients $c_k$ in (2) are

$$c_k = \frac{1}{k\,2^{2k-1}} \binom{2k-2}{k-1}, \qquad (6)$$

and thus simulating $L$ is particularly easy, as it lends itself to a simple probabilistic interpretation. Specifically, if a fair coin is flipped until the total number of tails exceeds the total number of heads, the probability that this happens after $2k-1$ steps is precisely (6). For other functions it is still possible to simulate $L$ from fair coin flips, or from a uniform random variable, as long as conditions (3) and (4) are satisfied; but a simple probabilistic experiment analogous to that for $f(p) = \sqrt{p}$ may not exist. The Bernoulli random variables $V_i$ provide an alternative, which can also be used for any coefficients $c_k$ that satisfy the indicated conditions. Effectively, each $d_k$ represents the conditional probability that $L = k$ given that $L \geq k$. The following proposition makes this clear.

Proposition 1. The coefficients $d_k$ defined by (5) satisfy

$$c_k = d_k \prod_{j=1}^{k-1} (1 - d_j), \qquad k \in \mathbb{N}.$$
Additionally, this interpretation of $d_k$ as a conditional probability makes it clear that if $d_K = 1$ for some $K$ (which occurs if the series in (2) is finite with $c_K > 0$, $c_k = 0$ for $k > K$) it is unnecessary to define the coefficients $d_k$ for $k > K$.
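The relation between the $c_k$ and $d_k$ coefficients can be illustrated numerically. The sketch below takes as assumptions the chain-rule identity $c_k = d_k \prod_{j<k}(1-d_j)$ and the $\sqrt{p}$ coefficients $c_k = \binom{2k-2}{k-1}/(k\,2^{2k-1})$ discussed in the text; the function name `d_from_c` is illustrative.

```python
from math import comb, isclose

def d_from_c(cs):
    """d_k = c_k / (1 - c_1 - ... - c_{k-1}), as in (5) (sketch)."""
    ds, prefix = [], 0.0
    for c in cs:
        ds.append(c / (1.0 - prefix))
        prefix += c
    return ds

# Coefficients of sqrt(p) = 1 - sum_k c_k (1-p)^k (assumed expansion)
cs = [comb(2*k - 2, k - 1) / (k * 2**(2*k - 1)) for k in range(1, 40)]
ds = d_from_c(cs)

# Check the conditional-probability interpretation: d_k * prod_{j<k} (1 - d_j) = c_k
prod = 1.0
for c, d in zip(cs, ds):
    assert isclose(d * prod, c)
    prod *= 1.0 - d
```

Each `d` is the conditional probability of $L = k$ given $L \ge k$, so multiplying by the survival probability `prod` recovers the unconditional $c_k$.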
The algorithm can also be phrased as a particular case of the reverse-time martingale approach of Łatuszyński, Kosmidis, Papaspiliopoulos and Roberts with random bounds [4, algorithm 3]. Specifically, from (2) it is possible to obtain monotone sequences of random upper and lower bounds that depend on the observed inputs, such that the sufficient conditions for the referenced algorithm are satisfied. This approach has the additional advantage that condition (4) is not required. Note, however, that this restriction of Algorithm 1 is immaterial, because the coefficients can always be scaled to sum to $1$ and then the output $Y$ can be multiplied by an independent Bernoulli variable with parameter equal to the desired sum (see Section 5).

Theorem 2. For $f$ given by (2)-(4) and $p \in (0,1)$, the average number of inputs required by Algorithm 1 is

$$E[N] = \frac{f(p)}{p}.$$

In addition, the algorithm is fast in the sense of (1).
It is interesting to consider the case $f(p) = p^a$, $a \in (0,1)$. This can be expressed in the form (2) with

$$c_k = \frac{a\,(1-a)^{(k-1)}}{k!},$$

which satisfies conditions (3) and (4).
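The coefficients of $p^a$ can be generated by the simple recurrence $c_1 = a$, $c_{k+1} = c_k (k-a)/(k+1)$, which follows from the closed form above. The sketch below (illustrative; the function name is hypothetical) checks numerically that the truncated series reproduces $p^a$.

```python
def pa_coefficients(a, K):
    """First K coefficients c_k of p^a = 1 - sum_{k>=1} c_k (1-p)^k (sketch)."""
    cs, c = [], a                    # c_1 = a
    for k in range(1, K + 1):
        cs.append(c)
        c *= (k - a) / (k + 1)       # c_{k+1} = c_k (k - a) / (k + 1)
    return cs

a, p = 0.3, 0.4
cs = pa_coefficients(a, 200)
approx = 1.0 - sum(c * (1.0 - p)**k for k, c in enumerate(cs, start=1))
# approx agrees with p**a up to the (geometrically small) truncation error
```

The partial sums of the $c_k$ approach $1$ only slowly (the tail decays polynomially), but the extra factor $(1-p)^k$ makes the series itself converge geometrically for any fixed $p \in (0,1)$.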

Asymptotic optimality
A natural question is whether the average number of inputs required by Algorithm 1 can be improved by some other algorithm. The following proposition and theorem are useful steps towards the answer.

Proposition 2. Given an open set $S \subseteq (0,1)$, consider a function $f : S \to (0,1)$ and a (possibly randomized) Bernoulli factory for $f$ that is fast, as defined by (1). Then $E[N]$ is finite and is a continuous function of $p \in S$.
This proposition, which will be used to prove Theorem 3, can be given an intuitive interpretation as follows. Since the stopping functions only depend on the values of the sequence X and the distribution of those values varies smoothly with p, it seems reasonable to expect E[N] to be a continuous function of p. As established by the proposition, fastness of the Bernoulli factory is indeed sufficient to ensure this.

Theorem 3. Consider an open set $S \subseteq (0,1)$, a differentiable function $f : S \to (0,1)$ and a (possibly randomized) Bernoulli factory for $f$ that is fast in the sense of (1). Then

$$E[N] \geq \frac{p(1-p)\,f'(p)^2}{f(p)(1-f(p))}. \qquad (10)$$
The main idea in the proof of this theorem is as follows. Given an arbitrary, possibly randomized factory for f that uses the input sequence X with parameter p, the output Y can be seen as a sequential estimator of f (p). Wolfowitz's extension of the Cramér-Rao bound to sequential estimators [8] can be applied to Y . This sets a lower bound on Var[Y ] that depends on E[N]. Comparing with the actual variance gives the claimed lower bound on E[N].
The proof technique has some similarities with those used by Huber [5,6]. Specifically, the method employed in [5] to establish a lower bound on E[N] for linear functions f (p) = cp, c > 1, p ∈ (0, (1 − ε)/c) is also based on considering the Bernoulli factory as an estimator of f (p); but instead of the Cramér-Rao inequality, a different bound is used which relates E[N] to the confidence level for a given interval. In [6] the Cramér-Rao inequality is used, albeit informally, to provide evidence for a lower bound on E[N] for linear functions.
In order to compare Algorithm 1 with others in terms of $E[N]$, the most interesting case is that of functions $f$ for which this algorithm gives $E[N] \to \infty$ as $p \to 0$. In this case, since the average number of inputs used by Algorithm 1 grows without bound, it is important to know if the growth rate could be reduced by using some other algorithm. As will be established by Theorem 4, for a certain subclass of these functions the average number of inputs required by any fast algorithm increases, as $p \to 0$, at least as fast as it does with Algorithm 1, which is thus asymptotically optimal.

Consider the class of functions $f : (0,1) \to (0,1)$ given by (2)-(4) that satisfy

$$\lim_{p\to 0} \frac{f(p)}{p} = \infty \qquad (11)$$

and

$$\liminf_{p\to 0} \frac{p\,f'(p)}{f(p)} > 0. \qquad (12)$$

Conditions (11) and (12) mean that, asymptotically, $f(p)$ increases faster than $p$ and $f'(p)$ increases at least as fast as $f(p)/p$. In particular, they are fulfilled by any differentiable function $f$ such that

$$\lim_{p\to 0} \frac{f(p)}{b\,p^a} = 1 \quad \text{for some } a \in (0,1),\ b \in (0,\infty). \qquad (13)$$

Indeed, it is clear that (11) holds if (13) does. As for (12), observe that (13) implies $\lim_{p\to 0} f(p) = 0$, and thus by L'Hôpital's rule

$$\lim_{p\to 0} \frac{f'(p)}{a b\,p^{a-1}} = 1. \qquad (14)$$

Dividing (14) by (13) it is seen that $\lim_{p\to 0} f'(p)\,p / f(p) = a$, which implies (12). Some examples of functions of the form (2)-(4) for which (11) and (12) hold are given by the next proposition.

Proposition 3. The following functions can be expressed as in (2)-(4) and satisfy (11) and (12):
Note that functions (15)-(18) satisfy the more specialized condition (13), whereas (19) does not.
The following theorem and its corollary establish that, for the class of functions defined above, Algorithm 1 is asymptotically optimum among all Bernoulli factories that are fast in the sense of Nacu and Peres.
Theorem 4. Let $S \subseteq (0,1)$ be an open set that has $0$ as a limit point. Consider a differentiable function $f : S \to (0,1)$ for which (12) holds. Any (possibly randomized) Bernoulli factory for $f$ that is fast in the sense of (1) satisfies

$$E[N] = \Omega\!\left(\frac{f(p)}{p}\right) \quad \text{as } p \to 0. \qquad (20)$$

Note that this theorem does not require (11). However, for functions that do not satisfy this condition the result (20) is less interesting, and indeed a stronger bound on $E[N]$ can be found. Namely, for a non-constant function any algorithm needs to take at least one input from $X$, and thus $E[N] \geq 1$, which is $\Omega(f(p)/p)$ if (11) does not hold. On the other hand, a constant function only satisfies the hypotheses of the theorem if it is the null function; and in any case, constant functions can be simulated by a randomized algorithm without observing $X$ (only $U$ is needed).

Corollary 1. For any function $f$ that can be expressed as (2)-(4) and satisfies conditions (11) and (12), Algorithm 1 is asymptotically optimal as $p \to 0$ among all fast algorithms; that is, any algorithm that satisfies condition (1) requires an average number of inputs $E[N] = \Omega(f(p)/p)$, which is the growth rate achieved by Algorithm 1.

The results presented in this section are somewhat related to other results that have appeared in previous works. Elias [9] considers a non-randomized Bernoulli factory for the function $f(p) = 1/2$, obtains an entropy-based bound on the average number of outputs per input, and gives an algorithm that approaches that bound. Peres [10] shows that iterating von Neumann's procedure achieves the same efficiency. Stout and Warren [11] carry out a similar analysis for the average number of inputs per output required for simulating $f(p) = 1/2$, and also propose several algorithms. Huber [5,6] obtains lower bounds on $E[N]$ for linear functions $f(p) = cp$, $c > 1$. Particularized to this case, Theorem 3 yields the bound that was conjectured in [6, section 4] based on an informal argument, which is thus formalized (and generalized). As for Theorem 4, it cannot be directly compared with the bounds in the above referenced works because, as previously mentioned, the theorem does not apply to the constant function $f(p) = 1/2$, and for linear functions it reduces to the trivial $\liminf_{p\to 0} E[N] > 0$.

Non-randomized algorithm
A non-randomized version of Algorithm 1 will be given next. Instead of using an auxiliary variable $U_i$ to produce a Bernoulli variable $V_i$ with parameter $d_i$ in step 3, the required $V_i$ is obtained from additional input samples, using a well known procedure based on the binary expansion of $d_i$ [3, proof of proposition 13]. Note also that a variation of [4, algorithm 3] could be used for the same purpose, with progressively finer truncations of the binary representation of $d_i$ providing the lower and upper bounds required therein.

Consider the binary expansion of $d_i \in [0,1]$, assuming zero as integer part and an infinite number of digits in the fractional part. Thus the decimal values $0$, $0.75$ and $1$ are respectively expressed in binary as $0.000\cdots$, $0.11000\cdots$ (or equivalently $0.10111\cdots$) and $0.11111\cdots$.

Algorithm 2. The algorithm uses the same steps 1-4 from Algorithm 1 except that step 3 is replaced by the following:

3.1. Set $j = 1$.
3.2. Take two inputs from $X$. If they are equal, discard them and repeat until two different inputs are obtained. Let $B_j$ be the first of these two inputs ($B_j$ is a Bernoulli variable with parameter $1/2$).
3.3. Let $D_j$ be the $j$-th fractional binary digit of $d_i$. If $B_j < D_j$, set $V_i = 1$ and continue to step 4. If $B_j > D_j$, set $V_i = 0$ and continue to step 4. Otherwise increase $j$ and go back to step 3.2.

The total number of inputs taken from $X$ is obviously greater than with Algorithm 1. However, the final value of $i$ in Algorithm 1 has an exponential tail, which can be used for establishing that Algorithm 2 is fast as defined by Nacu and Peres.
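The digit-comparison procedure from [3, proof of proposition 13] can be sketched as follows. This is an illustrative sketch only: `sample_x` is a placeholder for the Bernoulli($p$) input source, and $d = 1$ is special-cased because its expansion $0.111\cdots$ cannot be read off digit-by-digit from a float.

```python
import random

def fair_bit(sample_x):
    """Von Neumann's trick: a Bernoulli(1/2) bit from pairs of Bernoulli(p) inputs."""
    while True:
        x1, x2 = sample_x(), sample_x()
        if x1 != x2:
            return x1            # (1,0) and (0,1) are equally likely

def binary_digit(d, j):
    """j-th fractional binary digit of d in [0, 1) (j >= 1)."""
    return int(d * 2**j) % 2

def bernoulli_from_digits(d, sample_x):
    """Bernoulli(d) variable: compare fair bits with the binary digits of d."""
    if d >= 1.0:
        return 1                 # d = 1 corresponds to the expansion 0.111...
    j = 1
    while True:
        b = fair_bit(sample_x)
        if b != binary_digit(d, j):
            # 0.B_1 B_2 ... < d exactly when the first differing bit of B is smaller
            return 1 - b
        j += 1
```

The output is $1$ precisely when the uniformly distributed binary string $0.B_1 B_2\cdots$ falls below $d$, which happens with probability $d$.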

Theorem 5. Algorithm 2 requires an average total number of inputs

$$E[N] = \left(1 + \frac{2}{p(1-p)}\right) \frac{f(p)}{p}. \qquad (21)$$

In addition, the algorithm satisfies (1).
The value of $E[N]$ attained by Algorithm 2 could be improved in several ways. Firstly, if $d_i \in [0,1]$ happens to be a dyadic number, i.e. its binary expansion has an infinite trail of zeros or ones starting at the $m$-th digit, step 3.2 is unnecessary for $j \geq m$ (once $j$ reaches $m$, the output $V_i$ is known to be the repeating digit). This can lead to a lower $E[N]$ for certain functions. Secondly, step 3.2, which is von Neumann's procedure for obtaining a Bernoulli variable with parameter $1/2$, can be replaced by more efficient approaches; see for example [9] and [10]. Lastly, instead of Algorithm 2 a modification of [4, algorithm 3] could be used, replacing the auxiliary uniform random variable required therein by a procedure similar to step 3 of Algorithm 2.

Extensions of the algorithms
The presented algorithms can be modified in several ways to extend the range of functions that can be simulated. Algorithm 1 will be considered in the following, but the discussion also applies to its non-randomized version given by Algorithm 2.
An obvious modification is to change step 4 of the algorithm so that instead of $Y = X_i$ it outputs the complementary variable $Y = 1 - X_i$. This simulates the function $g(p) = 1 - f(p)$. The average number of inputs and asymptotic optimality of the modified algorithm are unaffected. Equation (2) and conditions (11) and (12) are modified replacing $f(p)$ by $1 - g(p)$, whereas (3) and (4) are maintained.
The same operation can be applied to the input variables in step 2. This allows simulating functions $g(p) = f(1-p)$, where $f$ satisfies the conditions of the original algorithm; the simulation is then asymptotically optimal for $p \to 1$.
It is also possible to simulate a function obtained from applying certain operations to two constituent functions $f_1$ and $f_2$. Define the functions $f$ (composition of $f_1$, $f_2$), $g$ (product with complement) and $h$ (convex combination) as follows:

$$f(p) = f_2(f_1(p)), \qquad g(p) = 1 - (1 - f_1(p))(1 - f_2(p)), \qquad h(p) = \alpha f_1(p) + (1-\alpha) f_2(p), \quad \alpha \in (0,1).$$

Proposition 4. Consider $f_1$, $f_2$ that can be expressed as in (2)-(4).

• $f$, $g$ and $h$ can also be expressed as in (2)-(4).
• $f$, $g$ and $h$ satisfy (11) if $f_1$ and $f_2$ do.
• $f$, $g$ and $h$ satisfy (12) if $f_1$ and $f_2$ do.
By Proposition 4, if $f_1$ and $f_2$ satisfy (2)-(4), Algorithm 1 can be used to directly simulate $f$, $g$ or $h$. Alternatively, it is possible to simulate $f_1$ and $f_2$ separately and then combine the results to obtain the desired function. In the three cases this alternative approach is easily seen to require the same average number of inputs as applying Algorithm 1 to $f$, $g$ or $h$. Consider for example the case of the function $f$. In the alternative approach, Algorithm 1 is first applied to simulate $f_1$ with input sequence $X$. This produces a sequence of Bernoulli variables with parameter $f_1(p)$. Then the algorithm is applied again to simulate $f_2$ on this sequence. The first stage requires $f_1(p)/p$ inputs on average. The second uses on average $f_2(f_1(p))/f_1(p)$ outputs of the first stage as inputs. The average number of original inputs is the product of those two numbers, which equals $f(p)/p$.
Of course, other combinations of functions may be realizable, even if the resulting function cannot be simulated directly by a single application of the algorithm. For example, if $f_1$, $f_2$ satisfy the conditions for Algorithm 1, it is immediate to simulate $f(p) = f_1(p) f_2(p)$ by multiplying the outputs for $f_1$ and $f_2$ (the average number of required inputs can be reduced by not computing the second output if the first is $0$). However, it may not be possible to simulate $f$ directly because its coefficients $c_k$ are not necessarily non-negative. Similarly, given $\alpha \in (0,1)$, multiplying the output for $f_1$ by a Bernoulli variable with parameter $\alpha$ simulates $\alpha f_1$. This allows relaxing the restriction (4) to $\sum_{k=1}^{\infty} c_k \leq 1$.
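The scaling trick in the last two sentences can be sketched as follows (illustrative; `simulate_f1` is a placeholder for a factory producing Bernoulli($f_1(p)$) outputs):

```python
import random

def scaled_factory(simulate_f1, alpha):
    """Simulate alpha * f1(p) by multiplying the factory output with an
    independent Bernoulli(alpha) variable drawn from the cost-free source U.
    Checking the alpha-coin first avoids consuming any inputs when it is 0."""
    if random.random() >= alpha:
        return 0
    return simulate_f1()
```

Since the two factors are independent, the output is $1$ with probability $\alpha f_1(p)$, and on average only a fraction $\alpha$ of the runs consume inputs from $X$.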

Conclusions and future work
An algorithm has been presented that can simulate certain functions $f$ using an average number of inputs that, within the class of fast algorithms (in the sense of [3]), is asymptotically optimal for vanishingly small $p$. This algorithm generalizes that given in [7] for $f(p) = \sqrt{p}$, uses fewer inputs, and admits a non-randomized version.
In future research, it would be interesting to relax the sufficient condition (12) for asymptotic optimality (Theorem 4), perhaps using a different proof technique. The class of Bernoulli factories to which Theorems 3 and 4 apply (namely, fast factories) could be extended if the continuity of E[N] as a function of p (Proposition 2) could be proved under more general conditions. It would also be useful to extend the algorithm to a more general class of functions, especially in relation to condition (3). In this regard, [4, section 3.1] gives a method to simulate functions with alternating series expansions.

Proof of Theorem 1
The algorithm ends after taking $n$ inputs and producing output $Y = 0$ if and only if $X_i = 0$, $V_i = 0$ for $i \leq n-1$; $V_n = 1$; and $X_n = 0$. Since all these variables are independent,

$$\Pr[N = n, Y = 0] = (1-p)^n\, d_n \prod_{i=1}^{n-1} (1 - d_i),$$

which by Proposition 1 equals $c_n (1-p)^n$. Therefore

$$\Pr[Y = 0] = \sum_{n=1}^{\infty} c_n (1-p)^n = 1 - f(p).$$

Proof of Theorem 2
The algorithm requires at least $n$ inputs if and only if $X_i = 0$ and $V_i = 0$ for $i = 1, \ldots, n-1$. Thus

$$\Pr[N \geq n] = (1-p)^{n-1} \prod_{i=1}^{n-1} (1 - d_i). \qquad (30)$$

Making use of Proposition 1 and equation (5),

$$E[N] = \sum_{n=1}^{\infty} \Pr[N \geq n] = \sum_{n=1}^{\infty} (1-p)^{n-1} \left(1 - \sum_{k=1}^{n-1} c_k\right). \qquad (32)$$

Since all the terms in the double series are non-negative, the order of summation can be changed. This gives, taking into account (2),

$$E[N] = \frac{1}{p} - \frac{1}{p} \sum_{k=1}^{\infty} c_k (1-p)^k = \frac{f(p)}{p}.$$

Using the fact that all coefficients $d_i$ are upper-bounded by $1$, (30) yields

$$\Pr[N > n] \leq (1-p)^n,$$

and thus (1) holds with $A = 1$, $\beta = 1 - p < 1$.
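The computation above can be checked numerically, as in the following sketch. The $\sqrt{p}$ coefficients are taken as an assumption, and the series is truncated where $\Pr[N \geq n]$ is already negligible.

```python
from math import comb

p, K = 0.3, 200
# Coefficients of sqrt(p) (assumed expansion) and the d_k of (5)
cs = [comb(2*k - 2, k - 1) / (k * 2**(2*k - 1)) for k in range(1, K + 1)]
ds, prefix = [], 0.0
for c in cs:
    ds.append(c / (1.0 - prefix))
    prefix += c

# E[N] = sum_n Pr[N >= n], with Pr[N >= n] = (1-p)^{n-1} prod_{i<n} (1 - d_i)
EN, surv = 0.0, 1.0       # surv holds Pr[N >= n] at the current n
for d in ds:
    EN += surv
    surv *= (1.0 - p) * (1.0 - d)
# EN agrees with f(p)/p = sqrt(p)/p up to the (tiny) truncation error
```

The truncated tail is bounded by $(1-p)^K$, so $K = 200$ terms suffice for essentially full float accuracy at $p = 0.3$.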

Proof of Proposition 2
Let $S$ be an open subset of $(0,1)$, and let $f : S \to (0,1)$ be a function. Consider an arbitrary, randomized Bernoulli factory $B$ for $f$ based on the sequence $X$ with parameter $p$, with randomizing sequence $U$, stopping functions $\tau_i(x_1, u_1; \ldots; x_i, u_i)$, and output functions $\gamma_i(x_1, u_1; \ldots; x_i, u_i)$, $i \in \mathbb{N}$, all assumed to be measurable.
For clarity, the following definitions will be used, which explicitly show the dependence of certain probabilities on $p$: $\Phi_n(p) = \Pr[N \geq n]$; $\varphi_n(p) = \Pr[N = n]$; and $\varphi_{n,y}(p) = \Pr[N = n, Y = y]$ for $y \in \{0,1\}$.
The randomized factory $B$ can be replaced by an equivalent, non-randomized sequential procedure $B'$ that produces the same output using an input sequence $Z$ defined by $Z_i = X_i + U_i$. The equivalence is clear from the fact that $X_i$ and $U_i$ can be retrieved from $Z_i$ as $X_i = \lfloor Z_i \rfloor$ and $U_i = Z_i - \lfloor Z_i \rfloor$. Let the random variable $N$ represent the number of $Z_i$ inputs used by $B'$ (or of $X_i$ inputs used by $B$). The sequence $Z$ will be said to have parameter $p$ if the underlying $X$ sequence has parameter $p$. Note that the stopping time $N$ is randomized from the point of view of $X$, but is non-randomized with respect to $Z$.
The probability $\varphi_n(p) = \Pr[N = n]$ is computed as the sum of $2^n$ terms $\pi(x_1, \ldots, x_n)$, each associated to an $n$-tuple $(x_1, \ldots, x_n) \in \{0,1\}^n$, where $\pi(x_1, \ldots, x_n)$ is the probability that the first $n$ inputs of $X$ are $x_1, \ldots, x_n$ and $B$ stops at $N = n$. With $r = x_1 + \cdots + x_n$, this can be expressed in terms of the stopping functions $\tau_i$ as follows:

$$\pi(x_1, \ldots, x_n) = p^r (1-p)^{n-r} \int_0^1 \!\!\cdots\! \int_0^1 \left( \prod_{i=1}^{n-1} \big(1 - \tau_i(x_1, u_1; \ldots; x_i, u_i)\big) \right) \tau_n(x_1, u_1; \ldots; x_n, u_n)\, du_1 \cdots du_n. \qquad (39)$$

The factors in (39) involving stopping functions do not depend on $p$. Thus $\pi(x_1, \ldots, x_n)$ is a polynomial in $p$, and so is $\varphi_n(p)$. This implies that $\Phi_n(p) = 1 - \sum_{i=1}^{n-1} \varphi_i(p)$ is also a polynomial, and thus a continuous function of $p$. Therefore, taking into account that

$$E[N] = \sum_{n=1}^{\infty} \Phi_n(p),$$

to establish the finiteness of $E[N]$ and its continuity as a function of $p \in S$ it suffices to show that the above series converges uniformly on any interval $[\zeta, \eta] \subset S$. Consider $\zeta, \eta > 0$ arbitrary such that $[\zeta, \eta] \subset S$. By assumption $B$ is fast, that is, $\Phi_{n+1}(p)$ satisfies a bound of the form (1). According to [3, proposition 21], this bound can be made uniform on $[\zeta, \eta]$. Thus, there exist $A$ and $\beta$ independent of $p$ such that $\Phi_n(p) \leq A \beta^n$ for all $p \in [\zeta, \eta]$. This implies that, given $t \in \mathbb{N}$,

$$\sum_{n=t}^{\infty} \Phi_n(p) \leq \frac{A \beta^t}{1 - \beta} \quad \text{for all } p \in [\zeta, \eta],$$

which establishes the required uniform convergence.

Proof of Theorem 3

Applying the sequential procedure $B'$ to inputs taken from $Z$ with parameter $p$ produces a Bernoulli variable $Y$ with parameter $f(p)$. The key idea of the proof is to consider $Y$ as an estimator of $f(p)$. Namely, the variance of $Y$ is

$$\mathrm{Var}[Y] = f(p)(1 - f(p)). \qquad (42)$$

Substituting this into inequality (47) from Lemma 2, to be proved below, will give (10). This lemma, in turn, uses the result in Lemma 1.
Lemma 2. Under the hypotheses of Theorem 3,

$$\mathrm{Var}[Y] \geq \frac{p(1-p)\,f'(p)^2}{E[N]}. \qquad (47)$$

Proof. It suffices to prove that, given an arbitrary interval $(\zeta, \eta) \subset S$ with $\zeta > 0$, $\eta < 1$, (47) holds for all $p \in (\zeta, \eta)$. This will be done using Wolfowitz's extension of the Cramér-Rao bound to sequential estimators [8], which particularized to $B'$ and $Y$ will give the desired result. Consider $(\zeta, \eta) \subset S$, $\zeta > 0$, $\eta < 1$. The sequential version of the Cramér-Rao bound will hold on this interval if the five regularity conditions enunciated in [8, section 3] are satisfied. The first condition specifies that $p$ must belong to an open interval, which is indeed the case.
The second regularity condition requires that $\partial \lambda(z; p)/\partial p$ exist for all $p$ and almost all $z$, and that $E[\partial \log \lambda(Z; p)/\partial p] = 0$ and $E[(\partial \log \lambda(Z; p)/\partial p)^2] > 0$ for all $p \in (\zeta, \eta)$, where $Z$ is a generic variable from the sequence $Z$ and $\lambda(z; p)$ denotes its density, which equals $1-p$ for $z \in (0,1)$ and $p$ for $z \in (1,2)$. This easily follows by computing

$$\frac{\partial \lambda(z; p)}{\partial p} = \begin{cases} -1 & z \in (0,1), \\ 1 & z \in (1,2), \end{cases} \qquad \frac{\partial \log \lambda(z; p)}{\partial p} = \begin{cases} -1/(1-p) & z \in (0,1), \\ 1/p & z \in (1,2), \end{cases}$$

from which $E[\partial \log \lambda(Z; p)/\partial p] = (1-p)\cdot(-1/(1-p)) + p\cdot(1/p) = 0$ and

$$E\!\left[\left(\frac{\partial \log \lambda(Z; p)}{\partial p}\right)^{\!2}\right] = \frac{1}{1-p} + \frac{1}{p} = \frac{1}{p(1-p)},$$

which is strictly positive as required.
As indicated, the result in Theorem 3 readily follows from (42) and Lemma 2.

Proof of Theorem 4
The proof uses a standard argument based on the definition of Ω-type asymptotic growth and on Theorem 3.

Proof of Theorem 5
The following result about geometric random variables will be used.

Lemma 3. Let $G$ be a geometric random variable with parameter $q \in (0,1]$, i.e. $\Pr[G = n] = q(1-q)^{n-1}$, $n \in \mathbb{N}$. Then $G$ has an exponential tail: $\Pr[G > n] = (1-q)^n$.
Steps 2-4 of Algorithm 2 (which are the same as in Algorithm 1 except step 3) form an (outer) loop on $i$ which is repeated until the exit condition in step 4 is met. For each iteration of this loop, Algorithm 1 uses one input from $X$, whereas Algorithm 2 uses that input plus additional ones that are needed for generating $V_i$, as specified by steps 3.1-3.3. For a given $i$, the number of iterations of the (inner) loop on $j$ formed by steps 3.2 and 3.3 is a geometric random variable $K_i$ with parameter $1/2$, and thus $E[K_i] = 2$. For each $j$, the number of blocks of $2$ inputs required within step 3.2 (which can be regarded as a third-level, innermost loop) is a geometric random variable $L_{i,j}$ with parameter $2p(1-p)$, and thus $E[L_{i,j}] = 1/(2p(1-p))$. Let $L_i = L_{i,1} + \cdots + L_{i,K_i}$.
The variables $L_{i,j}$ are i.i.d. and independent from $K_i$, and therefore [18, page 194] $E[L_i] = E[K_i]\,E[L_{i,1}] = 1/(p(1-p))$. Thus, for each $i$, Algorithm 2 uses one input from $X$ in step 2 (like Algorithm 1 does), plus $1/(p(1-p))$ two-input blocks on average in steps 3.1-3.3. Consequently, Algorithm 2 uses on average $1 + 2/(p(1-p))$ times as many inputs as Algorithm 1 does. This proves (21). By Lemma 3, the variables $L_{i,j}$ as well as $K_i$ have exponential tails; and then [3, proposition 12] guarantees that $L_i$ has an exponential tail. For each $i$, the iteration formed by steps 2-4 of Algorithm 2 requires $1 + 2L_i$ inputs. The total number of iterations of the outer loop on $i$ coincides with the number of inputs of Algorithm 1, which has an exponential tail as established by Theorem 2. Since $1 + 2L_i$ also has an exponential tail, applying [3, proposition 12] again shows that the total number of inputs used by Algorithm 2 has an exponential tail, that is, satisfies (1).
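The Wald's-identity step above can be checked by simulation, as in the following sketch (the parameter values are illustrative):

```python
import random

random.seed(5)
p = 0.3

def geometric(q):
    """Trials up to and including the first success (success probability q)."""
    n = 1
    while random.random() >= q:
        n += 1
    return n

# L_i = L_{i,1} + ... + L_{i,K_i} with K_i ~ Geom(1/2) and L_{i,j} ~ Geom(2p(1-p));
# by Wald's identity E[L_i] = E[K_i] E[L_{i,1}] = 2 * 1/(2p(1-p)) = 1/(p(1-p))
trials = 20000
mean = sum(sum(geometric(2*p*(1-p)) for _ in range(geometric(0.5)))
           for _ in range(trials)) / trials
```

For $p = 0.3$ the predicted value is $1/(0.3 \cdot 0.7) \approx 4.76$, and the empirical mean agrees within sampling error.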