Lacunary series and stable distributions

By well-known results of probability theory, any sequence of random variables with bounded second moments has a subsequence satisfying the central limit theorem and the law of the iterated logarithm in a randomized form. In this paper we give criteria for a sequence $(X_n)$ of random variables to have a subsequence $(X_{n_k})$ whose weighted partial sums, suitably normalized, converge weakly to a stable distribution with parameter $0<\alpha<2$.


1 Introduction
It is known that sufficiently thin subsequences of general sequences of r.v.'s behave like i.i.d. sequences. For example, Chatterji [9], [10] and Gaposhkin [15], [16] proved that if a sequence $(X_n)$ of r.v.'s satisfies $\sup_n EX_n^2 < \infty$, then one can find a subsequence $(X_{n_k})$ and r.v.'s $X$ and $Y \ge 0$ such that

$$N^{-1/2} \sum_{k \le N} (X_{n_k} - X) \overset{d}{\longrightarrow} N(0, Y) \quad (1.1)$$

and

$$\limsup_{N \to \infty} (2N \log\log N)^{-1/2} \sum_{k \le N} (X_{n_k} - X) = Y^{1/2} \quad \text{a.s.}, \quad (1.2)$$

where $N(0, Y)$ denotes the distribution of the r.v. $Y^{1/2}\zeta$, where $\zeta$ is an $N(0,1)$ r.v. independent of $Y$. Komlós [19] proved that under $\sup_n E|X_n| < \infty$ there exist a subsequence $(X_{n_k})$ and an integrable r.v. $X$ such that

$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^N X_{n_k} = X \quad \text{a.s.},$$

and Chatterji [8] showed that under $\sup_n E|X_n|^p < \infty$, $0 < p < 2$, the conclusion of the previous theorem can be changed to

$$N^{-1/p} \sum_{k=1}^N (X_{n_k} - X) \longrightarrow 0 \quad \text{a.s.}$$

for some $X$ with $E|X|^p < \infty$. Note the randomization in all these examples: the role of the mean and the variance of the subsequence $(X_{n_k})$ is played by the random variables $X$ and $Y$. On the basis of these and several other examples, Chatterji [11] formulated the following heuristic principle.

Subsequence Principle. Let $T$ be a probability limit theorem valid for all sequences of i.i.d. random variables belonging to an integrability class $L$ defined by the finiteness of a norm $\|\cdot\|_L$. Then, if $(X_n)$ is an arbitrary (dependent) sequence of random variables satisfying $\sup_n \|X_n\|_L < +\infty$, there exists a subsequence $(X_{n_k})$ satisfying $T$ in a mixed form.
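The randomization can be made concrete with a small simulation (an illustrative model chosen for this sketch, not a construction from the paper): taking $X_n = Y^{1/2}\zeta_n$ with i.i.d. standard normal $\zeta_n$ and an independent random variance $Y$, the normalized partial sums follow the mixture law $N(0, Y)$ exactly, which is heavier-tailed than any single Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 20000, 256          # M independent realizations, N terms per partial sum

# Random variance Y (two-point law with E[Y] = 5), then conditionally
# i.i.d. N(0, Y) terms; this is an assumed toy model of the mixed CLT.
Y = rng.choice([1.0, 9.0], size=(M, 1))
X = np.sqrt(Y) * rng.standard_normal((M, N))

S = X.sum(axis=1) / np.sqrt(N)   # normalized partial sums; S ~ N(0, Y) exactly

var = S.var()                    # should be near E[Y] = 5
kurt = (S**4).mean() / var**2    # mixture kurtosis 3*(0.5*1+0.5*81)/25 = 4.92 > 3

print(var, kurt)
```

The kurtosis above the Gaussian value 3 is the numerical fingerprint of the random variance $Y$: no single normal law fits the limit.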
In a profound paper, Aldous [1] proved the validity of this principle for all limit theorems concerning the almost sure or distributional behavior of functionals $f_k(X_1, X_2, \ldots)$ of a sequence $(X_n)$ of r.v.'s. Most "usual" limit theorems belong to this class; for precise formulations, discussion and examples we refer to [1]. On the other hand, the theory does not cover functionals $f_k$ containing parameters (as in weighted limit theorems), nor limit theorems involving other types of uniformity. Such uniformities play an important role in analysis. For example, if from a sequence $(X_n)$ of r.v.'s with finite $p$-th moments ($p \ge 1$) one can select a subsequence $(X_{n_k})$ such that for some constant $0 < K < \infty$

$$K^{-1} \Big( \sum_{k=1}^N a_k^2 \Big)^{1/2} \le \Big\| \sum_{k=1}^N a_k X_{n_k} \Big\|_p \le K \Big( \sum_{k=1}^N a_k^2 \Big)^{1/2}$$

for every $N \ge 1$ and every $(a_1, \ldots, a_N) \in \mathbb{R}^N$, then the subspace of $L^p$ spanned by $(X_n)$ contains a subspace isomorphic to Hilbert space. Such embedding arguments go back to the classical paper of Kadec and Pelczynski [18] and play an important role in Banach space theory; see e.g. Dacunha-Castelle and Krivine [12], Aldous [2]. In the theory of orthogonal series and in Banach space theory we frequently need subsequences $(f_{n_k})$ of a sequence $(f_n)$ such that $\sum_{k=1}^\infty a_k f_{n_k}$ converges a.e. or in norm, after any permutation of its terms, for a class of coefficient sequences $(a_k)$. Here we need uniformity both over a class of coefficient sequences $(a_k)$ and over all permutations of the terms of the series. A number of uniform limit theorems for subsequences have been proved by ad hoc arguments. Révész [22] showed that for any sequence $(X_n)$ of r.v.'s satisfying $\sup_n EX_n^2 < \infty$ one can find a subsequence $(X_{n_k})$ and a r.v. $X$ such that $\sum_{k=1}^\infty a_k (X_{n_k} - X)$ converges a.s. provided $\sum_{k=1}^\infty a_k^2 < \infty$.
Under $\sup_n \|X_n\|_\infty < +\infty$, Gaposhkin [15] showed that there exist a subsequence $(X_{n_k})$ and r.v.'s $X$ and $Y \ge 0$ such that

$$A_N^{-1} \sum_{k=1}^N a_k (X_{n_k} - X) \overset{d}{\longrightarrow} N(0, Y), \qquad A_N = \Big( \sum_{k=1}^N a_k^2 \Big)^{1/2}, \quad (1.3)$$

for any real sequence $(a_k)$ satisfying the uniform asymptotic negligibility condition $\max_{k \le N} |a_k| / A_N \to 0$, and

$$\limsup_{N \to \infty} (2 A_N^2 \log\log A_N^2)^{-1/2} \sum_{k=1}^N a_k (X_{n_k} - X) = Y^{1/2} \quad \text{a.s.} \quad (1.4)$$

for any real sequence $(a_k)$ satisfying the Kolmogorov condition. For a fixed coefficient sequence $(a_k)$ the above results follow from Aldous' general theorems, but the subsequence $(X_{n_k})$ provided by the proofs depends on $(a_k)$; finding a subsequence working for all $(a_k)$ simultaneously requires a uniformity which is, in general, not easy to establish, and which can fail in important situations. (See Guerre and Raynaud [17] for a natural problem where uniformity is not valid.) In [1], Aldous used an equicontinuity argument to prove a permutation-invariant version of the theorem of Révész above, implying that every orthonormal system $(f_n)$ contains a subsequence $(f_{n_k})$ which, in the standard terminology, is an unconditional convergence system. This had been a long-standing open problem in the theory of orthogonal series (see Uljanov [24], p. 48) and was first proved by Komlós [20]. In [3] we used the method of Aldous to prove extensions of the Kadec-Pelczynski theorem, as well as selection theorems for almost symmetric sequences. The purpose of the present paper is to use a similar technique to prove a uniform limit theorem of probabilistic importance, namely the analogue of Gaposhkin's uniform CLT (1.3)-(1.4) in the case when the limit distribution of the normed sum is a stable law with parameter $0 < \alpha < 2$. To formulate our result, we need some definitions. Using the terminology of [6], call the sequence $(X_n)$ of r.v.'s determining if it has a limit distribution relative to any set $A$ in the probability space with $P(A) > 0$, i.e. if for any such $A$ there exists a distribution function $F_A$ with

$$\lim_{n \to \infty} P(X_n \le t \mid A) = F_A(t)$$

for all continuity points $t$ of $F_A$. By an extension of the Helly-Bray theorem (see [6]), every tight sequence of r.v.'s contains a determining subsequence.
Hence, in studying the asymptotic behavior of thin subsequences of general tight sequences, we can assume without loss of generality that our original sequence $(X_n)$ is determining. By [6], Proposition 2.1, for any continuity point $t$ of the limit distribution function $F_\Omega$, the sequence $I\{X_n \le t\}$ converges weakly in $L^\infty$ to some r.v. $G_t$; clearly $G_s \le G_t$ a.s. for any $s \le t$. (A sequence $(\xi_n)$ of bounded r.v.'s is said to converge weakly in $L^\infty$ to a bounded r.v. $\xi$ if $E(\xi_n \eta) \to E(\xi \eta)$ for every integrable r.v. $\eta$. To avoid confusion, we will call the ordinary weak convergence of probability theory distributional convergence.) Using a standard procedure (see e.g. Révész [23], Lemma 6.1.4), by choosing a dense countable set $D$ of continuity points of $F_\Omega$, one can construct versions of $G_t$, $t \in D$, such that, for every fixed $\omega \in \Omega$, the function $G_t(\omega)$, $t \in D$, extends to a distribution function. Letting $\mu$ denote the corresponding measure, $\mu$ is called the limit random measure of $(X_n)$; it was introduced by Aldous [1]; for properties and applications see [2], [3], [5], [6]. Clearly, $\mu$ can be considered as a measurable map from the underlying probability space $(\Omega, \mathcal{F}, P)$ to the space $M$ of probability measures on $\mathbb{R}$ equipped with the Prohorov metric $\pi$. It is easily seen that for any $A$ with $P(A) > 0$ and any continuity point $t$ of $F_A$ we have

$$F_A(t) = E_A\, \mu\big((-\infty, t]\big),$$

where $E_A$ denotes conditional expectation given $A$. Note that $\mu$ depends on the actual r.v.'s $X_n$, but the distribution of $\mu$ in $(M, \pi)$ depends solely on the distribution of the sequence $(X_n)$. The situation concerning the unweighted CLT for lacunary sequences can now be summarized by the following theorem.
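A hedged numerical sketch of the limit random measure (the model and the conditioning event are assumptions made for this illustration): for the exchangeable sequence $X_n = V\zeta_n$ with a random scale $V$ and i.i.d. standard normal $\zeta_n$, the limit random measure is $\mu = N(0, V^2)$, and the limit distribution relative to a set $A$ should be $F_A(t) = E_A\,\Phi(t/V)$.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
M = 200000

# Illustrative exchangeable model: one random scale V per realization,
# then X_n = V * zeta_n; its limit random measure is mu = N(0, V^2).
V = rng.uniform(0.5, 2.0, size=M)
X1 = V * rng.standard_normal(M)       # one realization of any fixed X_n

# Standard normal distribution function, vectorized via math.erf
Phi = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))

A = V > 1.2                           # a conditioning event with P(A) > 0
t = 1.0
lhs = (X1[A] <= t).mean()             # empirical P(X_n <= t | A)
rhs = Phi(t / V[A]).mean()            # E_A of mu((-inf, t]) = Phi(t / V)
full = (X1 <= t).mean()               # limit distribution relative to Omega

print(lhs, rhs, full)
```

The two conditional quantities agree, while the unconditional distribution function differs: the sequence has a different limit distribution on each conditioning set, which is exactly what "determining" with a nondegenerate random measure means.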
Theorem 1.1 Let $(X_n)$ be a determining sequence of r.v.'s with limit random measure $\mu$. Then there exists a subsequence $(X_{n_k})$ satisfying, together with all of its subsequences, the CLT (1.1) if and only if $\int_{-\infty}^{\infty} x^2 \, d\mu(x) < \infty$ a.s.; in this case $X$ and $Y$ in (1.1) are the mean and the variance of $\mu$, respectively.

The sufficiency part of the theorem is contained in Aldous' general subsequence theorems in [1]; the necessity was proved in our recent paper [7]. Note that the condition for the CLT for lacunary subsequences of $(X_n)$ is given in terms of the limit random measure of $(X_n)$, and this condition is the exact analogue of the condition in the i.i.d. case, only the common distribution of the i.i.d. variables is replaced by the limit random measure. Note also that the existence of second moments of $(X_n)$ (or the existence of any moments) is not necessary for the conclusion of Theorem 1.1.
In this paper we investigate the analogous question in the case of a nonnormal stable limit distribution, i.e. the question under what conditions a sequence $(X_n)$ of r.v.'s has a subsequence $(X_{n_k})$ whose weighted partial sums, suitably normalized, converge weakly to an $\alpha$-stable distribution, $0 < \alpha < 2$. For $c > 0$ and $0 < \alpha < 2$, let $G_{\alpha,c}$ denote the distribution function with characteristic function $\exp(-c|t|^\alpha)$ and let $S = S(\alpha, c)$ denote the class of symmetric distributions on $\mathbb{R}$ with characteristic function $\varphi$ satisfying

$$\lim_{t \to 0} \big(1 - \varphi(t)\big) / |t|^\alpha = c. \quad (1.9)$$

Our main result is

Theorem 1.2 Let $0 < \alpha < 2$, $c > 0$ and let $(X_n)$ be a determining sequence of r.v.'s with limit random measure $\mu$. Assume that $\mu \in S(\alpha, c)$ with probability 1. Then there exists a subsequence $(X_{n_k})$ such that for any real sequence $(a_k)$ satisfying

$$\max_{1 \le k \le N} |a_k| \big/ A_N \longrightarrow 0, \qquad A_N = \Big( \sum_{k=1}^N |a_k|^\alpha \Big)^{1/\alpha}, \quad (1.10)$$

we have

$$A_N^{-1} \sum_{k=1}^N a_k X_{n_k} \overset{d}{\longrightarrow} G_{\alpha,c}. \quad (1.11)$$

Condition (1.9) holds provided the corresponding (symmetric) distribution function $F$ has regularly decreasing tails, $x^\alpha \big(1 - F(x) + F(-x)\big) \to d$ as $x \to \infty$, for a suitable constant $d = d(\alpha, c) > 0$. (See Berkes and Dehling [4], Lemma 3.2.) Apart from the monotonicity condition, this is equivalent to the fact that $F$ is in the domain of normal attraction of a symmetric stable distribution. (See e.g. Feller [14], p. 581.) It is natural to ask if the conclusion of Theorem 1.2 remains valid (with a suitable centering factor) assuming only that $\mu \in S$ a.s., where $S$ denotes the domain of normal attraction of a fixed stable distribution. From the theory in [1] it follows that the answer is affirmative in the unweighted case $a_k = 1$, but in the uniform weighted case the question remains open. Symmetry plays no essential role in the proof of Theorem 1.2; it is used only in Lemma 2.2, and at the cost of minor changes in the proof, (1.9) can be replaced by a condition covering nonsymmetric distributions as well. But since we do not know the optimal condition, we restricted our investigations to the case (1.9), where the technical details are the simplest and the idea of the proof becomes the most transparent.
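As a numerical sanity check of the conclusion of Theorem 1.2 in the i.i.d. special case $\alpha = 1$, $c = 1$ (the weights and sample sizes below are arbitrary choices for this sketch): the standard Cauchy law has characteristic function $\exp(-|t|)$, hence lies in $S(1, 1)$, and by 1-stability the normalized weighted sums of i.i.d. Cauchy variables are again standard Cauchy, so their empirical characteristic function should track $\exp(-|t|)$.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 100000, 50
alpha = 1.0                      # standard Cauchy: alpha = 1, c = 1

a = 1.0 / np.arange(1, N + 1)    # an arbitrary weight sequence
A_N = np.sum(np.abs(a) ** alpha) ** (1.0 / alpha)

C = rng.standard_cauchy((M, N))
S = (C * a).sum(axis=1) / A_N    # by 1-stability, S is again standard Cauchy

for t in (0.5, 1.0, 2.0):
    ecf = np.cos(t * S).mean()   # the law is symmetric, so the ecf is real
    print(t, ecf, np.exp(-abs(t)))
```

The match of the empirical characteristic function with $\exp(-|t|)$ is exact in distribution here; for a dependent determining sequence the theorem asserts the same limit along a suitable subsequence.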
Given a sequence $(X_n^*)$ of r.v.'s and a random measure $\mu$ defined on a probability space $(\Omega, \mathcal{F}, P)$ such that the $X_n^*$ are conditionally i.i.d. given $\mu$ with conditional distribution $\mu$, the limit random measure of $(X_n^*)$ is easily seen to be $\mu$. The sequence $(X_n^*)$ is exchangeable, and passing to subsequences does not change its asymptotic properties; hence if $\mu \in S(\alpha, c)$ a.s., then the conclusion of Theorem 1.2 holds for the whole sequence $(X_n^*)$, without passing to any subsequence. (This follows directly also from Lemma 2.2.) Theorem 1.2 shows that any determining sequence $(X_n)$ with a limit random measure $\mu$ satisfying $\mu \in S(\alpha, c)$ a.s. has a subsequence $(X_{n_k})$ whose weighted partial sums behave, in a uniform sense, similarly to those of $(X_n^*)$.
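A hedged illustration of the conditionally i.i.d. case (the model is an assumption made for this sketch, not a construction from the paper): given a random scale $V$, let $X_n^* = C_n + V Z_n$ with $C_n$ standard Cauchy and $Z_n$ standard normal. Every realization of the limit random measure then has characteristic function $\exp(-|t| - V^2 t^2/2)$, so $1 - \varphi(t) \sim |t|$ as $t \to 0$ for every value of $V$, i.e. $\mu \in S(1, 1)$ a.s., and the unweighted normalized sums should approach $G_{1,1}$ even though $\mu$ itself is random.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 20000, 200

# Hypothetical conditionally i.i.d. model: random scale V, then
# X*_n = C_n + V * Z_n (Cauchy plus a Gaussian of random size).
V = rng.uniform(0.0, 1.0, size=(M, 1))
X = rng.standard_cauchy((M, N)) + V * rng.standard_normal((M, N))

# Unweighted normalized sums (a_k = 1, alpha = 1, A_N = N); the Gaussian
# component is washed out and the limit is the standard Cauchy law G_{1,1}.
S = X.sum(axis=1) / N
for t in (0.5, 1.0, 2.0):
    print(t, np.cos(t * S).mean(), np.exp(-t))
```

Note the contrast with Theorem 1.1: here no moments exist at all, yet the fixed tail constant $c$ forces a single, nonrandom stable limit.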
2 Proof of Theorem 1.2

As the first step of the proof, we select a sequence $n_1 < n_2 < \ldots$ of integers such that, after a suitable discretization of $(X_n)$, we have

$$P(X_{n_k} \in J \mid X_{n_1}, \ldots, X_{n_{k-1}})(\omega) \longrightarrow \mu(\omega, J) \quad \text{a.s.} \quad (2.1)$$
for a large class of intervals $J$. This step follows exactly Aldous [1]; see Proposition 11 of [1] for details. Let $(Y_n)$ be a sequence of r.v.'s on $(\Omega, \mathcal{F}, P)$ such that, given $X$ and $\mu$, the r.v.'s $Y_1, Y_2, \ldots$ are conditionally i.i.d. with distribution $\mu$, i.e.

$$P(Y_j \in B \mid X, \mu) = \mu(B) \quad \text{a.s.} \quad (2.2)$$

and

$$P(Y_1 \in B_1, \ldots, Y_k \in B_k \mid X, \mu) = \mu(B_1) \cdots \mu(B_k) \quad \text{a.s.} \quad (2.3)$$

for any $j$, $k$ and Borel sets $B, B_1, \ldots, B_k$ on the real line. Such a sequence $(Y_n)$ always exists after redefining $(X_n)$ and $\mu$ on a suitable, larger probability space; for example, one can define the triple $((X_n), \mu, (Y_n))$ on the product space $\mathbb{R}^\infty \times M \times \mathbb{R}^\infty$ as done in [1], p. 72. This redefinition does not change the distribution of the sequence $(X_n)$, and thus by Proposition 2.1 of [6] it remains determining. Since the random measure $\mu$ depends on the variables $X_n$ themselves and not only on the distribution of $(X_n)$, this redefinition will change $\mu$, but not the joint distribution of $(X_n)$ and $\mu$, on which our results depend. Using (2.1) and a martingale argument, it is shown in [1], Lemma 12, that

Lemma 2.1 For every $\sigma(X)$-measurable r.v. $Z$ and any $j \ge 1$ we have

$$(X_n, Z) \overset{d}{\longrightarrow} (Y_j, Z) \quad \text{as } n \to \infty.$$

We now construct a further subsequence of $(X_{n_k})$ satisfying the conclusion of Theorem 1.2. By reindexing our variables, we can assume that Lemma 2.1 holds with $n_k = k$. For our construction we need some auxiliary considerations. For a (nonrandom) measure $\mu \in S(\alpha, c)$, the corresponding characteristic function $\varphi$ satisfies

$$\varphi(t) = 1 - \big(c + \beta(t)\big)|t|^\alpha, \quad (2.4)$$

where $\beta$ is a bounded continuous function on $\mathbb{R}$ with $\beta(0) = 0$. Given $\mu_1, \mu_2 \in S(\alpha, c)$ with characteristic functions $\varphi_1, \varphi_2$ and corresponding functions $\beta_1, \beta_2$ in (2.4), define

$$\rho(\mu_1, \mu_2) = \sum_{k=1}^{\infty} 2^{-k} \sup_{k-1 \le |t| \le k} |\beta_1(t) - \beta_2(t)|. \quad (2.5)$$

Clearly, $\rho$ satisfies the triangle inequality, and if $\rho(\mu_1, \mu_2) = 0$, then $\varphi_1(t) = \varphi_2(t)$ for all $t \in \mathbb{R}$ and thus $\mu_1 = \mu_2$. Hence, $\rho$ is a metric on $S(\alpha, c)$. If $\mu, \mu_1, \mu_2, \ldots \in S(\alpha, c)$ with corresponding characteristic functions $\varphi, \varphi_1, \varphi_2, \ldots$ and functions $\beta, \beta_1, \beta_2, \ldots$, then $\rho(\mu_n, \mu) \to 0$ implies that $\beta_n(t) \to \beta(t)$, and consequently $\varphi_n(t) \to \varphi(t)$, uniformly on compact intervals, and thus $\mu_n \overset{d}{\longrightarrow} \mu$.
Conversely, if $\mu_n \overset{d}{\longrightarrow} \mu$, then $\varphi_n(t) \to \varphi(t)$ uniformly on compact intervals, and thus $\beta_n(t) \to \beta(t)$ uniformly on compact intervals not containing 0. Note that $\lim_{t \to 0} \beta_n(t) = 0$ for any fixed $n$ by the definition of $S(\alpha, c)$; if this relation holds uniformly in $n$, then $\beta_n(t) \to \beta(t)$ holds uniformly also on all compact intervals containing 0. Observing that (2.4) implies

$$|\beta(t)| \le |t|^{-\alpha} |\varphi(t) - 1| + c \le c + 2 \quad \text{for } |t| \ge 1,$$

so that the total contribution of the terms of the sum in (2.5) for $k \ge M$ is $\le 4(c+2)2^{-M}$, it follows that $\rho(\mu_n, \mu) \to 0$. Thus if for a class $H \subset S(\alpha, c)$ we have $\lim_{t \to 0} \beta(t) = 0$ uniformly for all functions $\beta$ corresponding to measures in $H$, then in $H$ convergence of elements in the Prohorov metric and in the metric $\rho$ are equivalent.
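For concreteness, a small numerical sketch of the metric $\rho$ (the formula below assumes the form $\rho(\mu_1,\mu_2) = \sum_k 2^{-k} \sup_{k-1 \le |t| \le k} |\beta_1(t) - \beta_2(t)|$, consistent with the tail bound $4(c+2)2^{-M}$ above; the grid and truncation level are arbitrary choices):

```python
import numpy as np

ALPHA, C = 1.0, 1.0
ts = np.arange(-2000, 2001) * 0.004    # grid on [-8, 8] containing 0 exactly

def beta(phi):
    """beta(t) = (1 - phi(t)) / |t|**ALPHA - C, extended by beta(0) = 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        b = (1.0 - phi(ts)) / np.abs(ts) ** ALPHA - C
    b[ts == 0.0] = 0.0
    return b

def rho(phi1, phi2, K=8):
    """Truncated version of the assumed metric rho on S(ALPHA, C)."""
    d = np.abs(beta(phi1) - beta(phi2))
    return sum(
        2.0 ** -k * d[(np.abs(ts) >= k - 1) & (np.abs(ts) <= k)].max()
        for k in range(1, K + 1)
    )

cauchy = lambda t: np.exp(-np.abs(t))                 # member of S(1, 1)
mixed  = lambda t: np.exp(-np.abs(t) - t ** 2 / 4.0)  # another member of S(1, 1)

print(rho(cauchy, cauchy), rho(cauchy, mixed))
```

Both characteristic functions satisfy $1 - \varphi(t) \sim |t|$ near 0, so both laws lie in $S(1,1)$, yet $\rho$ separates them through the behavior of $\beta$ away from the origin.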
Let now $\varphi(t) = \varphi(t, \omega)$ denote the characteristic function of the random measure $\mu = \mu(\omega)$. By the assumption $\mu \in S(\alpha, c)$ a.s. of Theorem 1.2, we have

$$\varphi(t, \omega) = 1 - \big(c + \beta(t, \omega)\big)|t|^\alpha, \quad (2.6)$$

where $\lim_{t \to 0} \beta(t, \omega) = 0$ a.s. Let $\xi_n(\omega) = \sup_{|t| \le 1/n} |\beta(t, \omega)|$; then $\lim_{n \to \infty} \xi_n(\omega) = 0$ a.s., and thus by Egorov's theorem (see [13]) for any $\varepsilon > 0$ there exists a measurable set $A \subset \Omega$ with $P(A) \ge 1 - \varepsilon$ such that $\lim_{n \to \infty} \xi_n(\omega) = 0$, and consequently $\lim_{t \to 0} \beta(t, \omega) = 0$, uniformly on $A$. Considering $A$ as a new probability space, we will show that there exists a subsequence $(X_{n_k})$ (depending on $A$) satisfying the conclusion of Theorem 1.2 together with all its subsequences. By a diagonal argument we can then get a subsequence $(X_{n_k})$ satisfying the conclusion of Theorem 1.2 on the original $\Omega$. Thus without loss of generality we can assume in the sequel that the function $\beta(t, \omega)$ in (2.6) satisfies $\lim_{t \to 0} \beta(t, \omega) = 0$ uniformly in $\omega \in \Omega$, and thus, by the remarks in the previous paragraph, on the support of the random measure $\mu$ the Prohorov metric and the metric $\rho$ generate the same convergence.
Lemma 2.2 Let $\mu_1, \mu_2 \in S(\alpha, c)$ with characteristic functions $\varphi_1, \varphi_2$ and corresponding functions $\beta_1, \beta_2$ in (2.4), and let $(Z_k)$, $(Z_k^*)$ be i.i.d. sequences of r.v.'s with respective distributions $\mu_1$, $\mu_2$. Let $(a_1, \ldots, a_n) \in \mathbb{R}^n$, $A_n = \big( \sum_{k=1}^n |a_k|^\alpha \big)^{1/\alpha}$ and $\delta_n = \max_{1 \le k \le n} |a_k| / A_n$. Then for $|t| \delta_n \le 1$ we have

$$\Big| E \exp\Big( itA_n^{-1} \sum_{k=1}^n a_k Z_k \Big) - E \exp\Big( itA_n^{-1} \sum_{k=1}^n a_k Z_k^* \Big) \Big| \le |t|^\alpha \sup_{|x| \le |t| \delta_n} |\beta_1(x) - \beta_2(x)|. \quad (2.7)$$

Proof. Letting $\varphi_1, \varphi_2$ denote the characteristic functions of the $Z_k$'s resp. $Z_k^*$'s and using (2.4), (1.10) and the inequality

$$\Big| \prod_{k=1}^n x_k - \prod_{k=1}^n y_k \Big| \le \sum_{k=1}^n |x_k - y_k|,$$

valid for all $|x_k| \le 1$, $|y_k| \le 1$, we get that for $|t| \delta_n \le 1$ the left hand side of (2.7) equals

$$\Big| \prod_{k=1}^n \varphi_1(a_k t / A_n) - \prod_{k=1}^n \varphi_2(a_k t / A_n) \Big| \le \sum_{k=1}^n |a_k t / A_n|^\alpha \, |\beta_1(a_k t / A_n) - \beta_2(a_k t / A_n)| \le |t|^\alpha \sup_{|x| \le |t| \delta_n} |\beta_1(x) - \beta_2(x)|,$$

proving (2.7).

Remark. The proof of Lemma 2.2 shows that for any $t \in \mathbb{R}$ the left hand side of (2.7) cannot exceed $|t|^\alpha \sup_{|x| \le |t| \delta_n} |\beta_1(x) - \beta_2(x)|$, a fact that will be useful in the sequel.
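The elementary ingredient of the proof is the telescoping bound $|\prod_{k} x_k - \prod_{k} y_k| \le \sum_{k} |x_k - y_k|$ for numbers in the closed unit disc, obtained by writing the difference of products as a sum of single-factor swaps, each multiplied by cofactors of modulus at most 1. A quick randomized check:

```python
import numpy as np

rng = np.random.default_rng(4)

# |prod x_k - prod y_k| <= sum |x_k - y_k| whenever |x_k|, |y_k| <= 1:
# prod x - prod y = sum_k (x_1...x_{k-1})(x_k - y_k)(y_{k+1}...y_n),
# and each cofactor product has modulus at most 1.
def check(n):
    x = rng.uniform(-1, 1, n) * np.exp(1j * rng.uniform(0, 2 * np.pi, n))
    y = rng.uniform(-1, 1, n) * np.exp(1j * rng.uniform(0, 2 * np.pi, n))
    return abs(np.prod(x) - np.prod(y)) <= np.sum(np.abs(x - y)) + 1e-12

ok = all(check(n) for n in range(1, 40) for _ in range(200))
print(ok)
```

Applied with $x_k = \varphi_1(a_k t / A_n)$ and $y_k = \varphi_2(a_k t / A_n)$, this converts the difference of the two characteristic functions of the weighted sums into the sum of one-dimensional differences estimated via (2.4).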
Given probability measures $\nu_n, \nu$ on the Borel sets of a separable metric space $(S, d)$, we say, as usual, that $\nu_n \overset{d}{\longrightarrow} \nu$ if

$$\int_S f \, d\nu_n \longrightarrow \int_S f \, d\nu$$

for every bounded, real valued continuous function $f$ on $S$. Equivalently, we write $Z_n \overset{d}{\longrightarrow} Z$, where $Z_n, Z$ are r.v.'s valued in $(S, d)$ (i.e. measurable maps from some probability space to $(S, d)$) with distributions $\nu_n, \nu$.
Lemma 2.3 (see [21]). Let $(S, d)$ be a separable metric space and let $\nu, \nu_1, \nu_2, \ldots$ be probability measures on the Borel sets of $(S, d)$ such that $\nu_n \overset{d}{\longrightarrow} \nu$. Let $G$ be a class of real valued functions on $(S, d)$ such that

(a) $G$ is locally equicontinuous, i.e. for every $\varepsilon > 0$ and $x \in S$ there is a $\delta = \delta(\varepsilon, x) > 0$ such that $y \in S$, $d(x, y) \le \delta$ imply $|f(x) - f(y)| \le \varepsilon$ for every $f \in G$.

(b) There exists a continuous function $g \ge 0$ on $S$ such that $|f(x)| \le g(x)$ for all $f \in G$ and $x \in S$, and

$$\int_S g(x) \, d\nu_n(x) \longrightarrow \int_S g(x) \, d\nu(x) \ (< \infty) \quad \text{as } n \to \infty. \quad (2.10)$$

Then

$$\sup_{f \in G} \Big| \int_S f \, d\nu_n - \int_S f \, d\nu \Big| \longrightarrow 0 \quad \text{as } n \to \infty.$$

Assume now that $(X_n)$ satisfies the assumptions of Theorem 1.2, fix $t \in \mathbb{R}$ and for any $n \ge 1$, $(a_1, \ldots, a_n) \in \mathbb{R}^n$ let

$$\psi(a_1, \ldots, a_n) = E \exp\Big( itA_n^{-1} \sum_{k=1}^n a_k Y_k \Big),$$

where $A_n = \big( \sum_{k=1}^n |a_k|^\alpha \big)^{1/\alpha}$ and $(Y_k)$ is the sequence of r.v.'s defined before Lemma 2.1. We show that for any $\varepsilon > 0$ there exists a sequence $n_1 < n_2 < \cdots$ of integers such that

$$\Big| E \exp\Big( itA_k^{-1} \sum_{i=1}^k a_i X_{n_i} \Big) - E \exp\Big( itA_k^{-1} \sum_{i=1}^k a_i Y_i \Big) \Big| \le \varepsilon \, \psi(a_1, \ldots, a_k) \quad (2.13)$$

for all $k \ge 1$ and all $(a_k)$ satisfying (1.10); moreover, (2.13) remains valid for every further subsequence of $(X_{n_k})$ as well. To construct $n_1$ we set

$$Q(a, n, \ell) = \exp\big( itA_\ell^{-1} (a_1 X_n + a_2 Y_2 + \cdots + a_\ell Y_\ell) \big)$$

for every $n \ge 1$, $\ell \ge 2$ and $a = (a_1, \ldots, a_\ell) \in \mathbb{R}^\ell$, and let $\widetilde{Q}(a, \ell)$ denote the same expression with $X_n$ replaced by $Y_1$, so that $E \widetilde{Q}(a, \ell) = \psi(a_1, \ldots, a_\ell)$. We show that

$$E\, Q(a, n, \ell) \big/ E\, \widetilde{Q}(a, \ell) \longrightarrow 1 \quad \text{as } n \to \infty, \text{ uniformly in } a, \ell. \quad (2.14)$$

(The right side of (2.14) equals 1.) To this end we recall that, given $X$ and $\mu$, the r.v.'s $Y_1, Y_2, \ldots$ are conditionally i.i.d. with common conditional distribution $\mu$ and thus, given $X$, $\mu$ and $Y_1$, the r.v.'s $Y_2, Y_3, \ldots$ are conditionally i.i.d. with distribution $\mu$. Thus

$$E\big( Q(a, n, \ell) \mid X, \mu \big) = g_{a,\ell}(X_n, \mu), \quad (2.15)$$

where

$$g_{a,\ell}(u, \nu) = \exp(it a_1 u / A_\ell) \prod_{k=2}^{\ell} \varphi_\nu(a_k t / A_\ell),$$

$\varphi_\nu$ denoting the characteristic function of $\nu$; similarly $E\, g_{a,\ell}(Y_1, \mu) = E \widetilde{Q}(a, \ell)$. Hence (2.14) will follow if we show that

$$E\, g_{a,\ell}(X_n, \mu) - E\, g_{a,\ell}(Y_1, \mu) \longrightarrow 0 \quad \text{as } n \to \infty, \text{ uniformly in } a, \ell. \quad (2.19)$$

Recall that $\rho$ is a metric on $S = S(\alpha, c)$; the remarks at the beginning of this section show that on the support of $\mu$ the metric $\rho$ and the Prohorov metric $\pi$ induce the same convergence and thus the same Borel $\sigma$-field; thus the limit random measure $\mu$, which is a random variable taking values in $(S, \pi)$, can also be regarded as a random variable taking values in $(S, \rho)$.
Also, $\mu$ is clearly $\sigma(X)$-measurable, and thus $(X_n, \mu)$, defined on the product metric space $(\mathbb{R} \times S, \lambda \times \rho)$ ($\lambda$ denotes the ordinary distance on $\mathbb{R}$), satisfies conditions (a), (b) of Lemma 2.3. To see the validity of (a), note that by (2.2), (2.3) the $Y_n$ are conditionally i.i.d. with respect to $\mu$ with conditional distribution $\mu$; moreover, we assumed without loss of generality that the characteristic function $\varphi(t, \omega)$ of $\mu(\omega)$ satisfies (2.6) with $\lim_{t \to 0} \beta(t, \omega) = 0$ uniformly in $\omega$, and thus, applying Lemma 2.2 with $\varphi_1(t) = \varphi(t, \omega)$ and $\varphi_2(t) = \exp(-c|t|^\alpha)$ and using (1.10) and the remark after the proof of Lemma 2.2, it follows that there exist an integer $n_0$ and a positive constant $c_0$ such that $\psi(a) \ge c_0$ for $n \ge n_0$ and all $(a_k)$. Thus the validity of (a) follows from Lemma 2.2; the validity of (b) is immediate from $|g_{a,\ell}(u, \nu)| \le 1$. We thus proved relation (2.19), and thus also (2.14), whence it follows (note again that the right side of (2.14) equals 1) that

$$\Big| E \exp\big( itA_\ell^{-1} (a_1 X_n + a_2 Y_2 + \cdots + a_\ell Y_\ell) \big) - E \exp\big( itA_\ell^{-1} (a_1 Y_1 + \cdots + a_\ell Y_\ell) \big) \Big| = o\big( \psi(a_1, \ldots, a_\ell) \big) \quad (2.21)$$

as $n \to \infty$, uniformly in $\ell$, $a$. Hence, given $\varepsilon > 0$, we can choose $n_1$ so large that

$$\Big| E \exp\big( itA_\ell^{-1} (a_1 X_n + a_2 Y_2 + \cdots + a_\ell Y_\ell) \big) - E \exp\big( itA_\ell^{-1} (a_1 Y_1 + \cdots + a_\ell Y_\ell) \big) \Big| \le \frac{\varepsilon}{2} \psi(a_1, \ldots, a_\ell) \quad (2.22)$$

for every $\ell$, $a$ and $n \ge n_1$. This completes the first induction step. Assume now that $n_1, \ldots, n_{k-1}$ have already been chosen. Exactly in the same way as we proved (2.21), it follows that for $\ell > k$, uniformly in $a$ and $\ell$,

$$\Big| E \exp\big( itA_\ell^{-1} (a_1 X_{n_1} + \cdots + a_{k-1} X_{n_{k-1}} + a_k X_n + a_{k+1} Y_{k+1} + \cdots + a_\ell Y_\ell) \big) - E \exp\big( itA_\ell^{-1} (a_1 X_{n_1} + \cdots + a_{k-1} X_{n_{k-1}} + a_k Y_k + \cdots + a_\ell Y_\ell) \big) \Big| = o\big( \psi(a_1, \ldots, a_\ell) \big).$$

Hence we can choose $n_k > n_{k-1}$ so large that

$$\Big| E \exp\big( itA_\ell^{-1} (a_1 X_{n_1} + \cdots + a_{k-1} X_{n_{k-1}} + a_k X_n + a_{k+1} Y_{k+1} + \cdots + a_\ell Y_\ell) \big) - E \exp\big( itA_\ell^{-1} (a_1 X_{n_1} + \cdots + a_{k-1} X_{n_{k-1}} + a_k Y_k + \cdots + a_\ell Y_\ell) \big) \Big| \le \frac{\varepsilon}{2^k} \psi(a_1, \ldots, a_\ell) \quad (2.23)$$

for every $(a_1, \ldots, a_\ell) \in \mathbb{R}^\ell$, $\ell > k$ and $n \ge n_k$. This completes the $k$-th induction step; the sequence $(n_k)$ so constructed obviously satisfies

$$\Big| E \exp\big( itA_\ell^{-1} (a_1 X_{n_1} + \cdots + a_\ell X_{n_\ell}) \big) - E \exp\big( itA_\ell^{-1} (a_1 Y_1 + \cdots + a_\ell Y_\ell) \big) \Big| \le \varepsilon \, \psi(a_1, \ldots, a_\ell)$$

for every $\ell \ge 1$ and $(a_1, \ldots, a_\ell) \in \mathbb{R}^\ell$, i.e. (2.13) is valid.
Since in the $k$-th induction step $n_k$ was chosen in such a way that the corresponding inequalities (2.22) (for $k = 1$) and (2.23) (for $k > 1$) hold not only for $n = n_k$ but for all $n > n_k$ as well, relation (2.13) remains valid for any further subsequence of $(X_{n_k})$.
To complete the proof of our theorem, it suffices to show that for any $t \in \mathbb{R}$ and any real sequence $(a_k)$ satisfying (1.10) we have

$$E \exp\Big( itA_k^{-1} \sum_{j=1}^k a_j Y_j \Big) \longrightarrow \exp(-c|t|^\alpha) \quad \text{as } k \to \infty. \quad (2.24)$$

Together with (2.13) and the fact that (2.13) remains valid for any further subsequence of $(X_{n_k})$ as well, this implies that for any $\varepsilon > 0$ and $t \in \mathbb{R}$ there exists an increasing sequence $(n_k)$ of positive integers (depending on $\varepsilon$ and $t$) such that for any further subsequence $(n_k')$ of $(n_k)$ we have

$$\Big| E \exp\Big( itA_k^{-1} \sum_{j=1}^k a_j X_{n_j'} \Big) - \exp(-c|t|^\alpha) \Big| \le 2\varepsilon$$

for any $k \ge k_0(\varepsilon, t)$ and any $(a_k)$ satisfying (1.10). By a diagonal argument this shows that there exists a sequence $(m_k)$ satisfying, together with all of its subsequences, the relation

$$E \exp\Big( itA_k^{-1} \sum_{j=1}^k a_j X_{m_j} \Big) \longrightarrow \exp(-c|t|^\alpha)$$

for any rational $t \in \mathbb{R}$ and any $(a_k)$ satisfying (1.10), which implies that

$$A_N^{-1} \sum_{k=1}^N a_k X_{m_k} \overset{d}{\longrightarrow} G_{\alpha, c},$$

completing the proof of Theorem 1.2. To verify (2.24), note that conditionally on $(X, \mu)$ the $Y_j$ are i.i.d. with conditional characteristic function $\varphi$ satisfying (1.9), which implies, in view of the remark after the proof of Lemma 2.2, that setting $S_k = \sum_{j=1}^k a_j Y_j$,

$$E\big( \exp(itA_k^{-1} S_k) \mid X, \mu \big) \longrightarrow \exp(-c|t|^\alpha) \quad \text{a.s.} \quad (2.25)$$

Integrating the last relation and using the dominated convergence theorem we get (2.24).