GCD sums from Poisson integrals and systems of dilated functions

Upper bounds for GCD sums of the form
\[
\sum_{k,\ell=1}^N\frac{(\gcd(n_k,n_{\ell}))^{2\alpha}}{(n_k n_{\ell})^\alpha}
\]
are proved, where $(n_k)_{1 \leq k \leq N}$ is any sequence of distinct positive integers and $0<\alpha \le 1$; the estimate for $\alpha=1/2$ solves in particular a problem of Dyer and Harman from 1986, and the estimates are optimal except possibly for $\alpha=1/2$. The method of proof is based on identifying the sum as a certain Poisson integral on a polydisc; as a byproduct, estimates for the largest eigenvalues of the associated GCD matrices are also found. The bounds for such GCD sums are used to establish a Carleson--Hunt-type inequality for systems of dilated functions of bounded variation or belonging to $\operatorname{Lip} 1/2$, a result that in turn settles two longstanding problems on the a.e.\ behavior of systems of dilated functions: the a.e.\ growth of sums of the form $\sum_{k=1}^N f(n_k x)$ and the a.e.\ convergence of $\sum_{k=1}^\infty c_k f(n_kx)$ when $f$ is 1-periodic and of bounded variation or in $\operatorname{Lip} 1/2$.


Introduction
This paper studies two closely related topics: greatest common divisor (GCD) sums of the form
\[
(1)\qquad \sum_{k,\ell=1}^N \frac{(\gcd(n_k,n_\ell))^{2\alpha}}{(n_k n_\ell)^\alpha} \qquad \text{for } 0<\alpha\le1,
\]
and convergence properties of systems of dilated functions $f(n_kx)$ on the unit interval $[0,1]$. Here $(n_k)_{k\ge1}$ is a sequence of distinct positive integers and $f$ is a 1-periodic real-valued function of bounded variation or belonging to the class $\operatorname{Lip} 1/2$. We will introduce a new method for estimating sums of the form (1) and in particular solve a problem posed by Dyer and Harman in [14]. In addition, using estimates for (1), we will establish a version of the Carleson--Hunt inequality that settles two longstanding problems regarding the a.e. behavior of systems of dilated functions.

The study of GCD sums like (1) was initiated by Koksma, who in the 1930s observed that such sums can be used to estimate integrals of the form
\[
(2)\qquad \int_0^1 \Big(\sum_{k=1}^N \big(\{n_kx\}-\tfrac12\big)\Big)^2\,dx,
\]
where the notation $\{\cdot\}$ stands for fractional part. Integrals like (2) give in turn important information about the distribution of the sequence $(\{n_kx\})_{k\ge1}$ for almost all $x\in(0,1)$. In the case $\alpha=1$, Gál [18] proved that
\[
(3)\qquad \frac1N\sum_{k,\ell=1}^N \frac{(\gcd(n_k,n_\ell))^2}{n_kn_\ell} \le c(\log\log N)^2,
\]
and he showed that this bound is optimal up to the value of the absolute constant $c$. In 1986, Dyer and Harman [14] proved that
\[
(4)\qquad \frac1N\sum_{k,\ell=1}^N \frac{\gcd(n_k,n_\ell)}{\sqrt{n_kn_\ell}} \le C\exp\Big(\frac{c\log N}{\log\log N}\Big)
\]
for two absolute constants $C$ and $c$, and they used this estimate to prove results in metric Diophantine approximation; Dyer and Harman found also that
\[
\frac1N\sum_{k,\ell=1}^N \frac{(\gcd(n_k,n_\ell))^{2\alpha}}{(n_kn_\ell)^\alpha} \le c(\alpha)\exp\big((\log N)^{(4-4\alpha)/(3-2\alpha)}\big)
\]
for $1/2<\alpha<1$. In his monograph [22], Harman writes that "it is tempting to conjecture" that the right-hand side of (4) can be replaced by a constant times $\exp\big(c\sqrt{\log N/\log\log N}\big)$. One of our examples given below will disprove this conjecture and show that here we cannot have a function smaller than $\exp\big(2\sqrt{(\log N)/\log\log N}\big)$.
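For concreteness, the normalized quantity appearing in (3) and (4) can be evaluated by brute force for small test sequences. The sketch below is purely illustrative (the function name is our own, not from the paper) and uses the $1/N$ normalization of (3) and (4):

```python
from math import gcd

def gcd_sum(ns, alpha):
    """Normalized GCD sum (1/N) * sum_{k,l} gcd(n_k, n_l)^(2a) / (n_k n_l)^a."""
    N = len(ns)
    total = 0.0
    for a in ns:
        for b in ns:
            g = gcd(a, b)
            total += g ** (2 * alpha) / (a * b) ** alpha
    return total / N

# Example: for n_k = 2^k the pair (i, j) contributes 2^{-|i-j|} when alpha = 1,
# so the normalized sum stays bounded; arithmetically rich sequences give more.
```

Such experiments only probe very small $N$, of course; the point of the paper is the behavior of the supremum over all sequences as $N\to\infty$.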
However, the following theorem, which is our main result on GCD sums, will "almost" confirm Harman's conjecture and yield optimal upper bounds for (1) when 1/2 < α < 1.
Theorem 1 is in fact a corollary to a more general result which can be given a function-theoretic interpretation on the infinite-dimensional polydisc $\mathbb{D}^\infty$. The observation underlying this general theorem is that the GCD sum (1) can be written as a certain Poisson integral evaluated at the point $(p_j^{-\alpha})$ in $\mathbb{D}^\infty$, where $p_j$ denotes the $j$-th prime number. Such integrals can be computed for arbitrary points in $\mathbb{D}^\infty$, and our theorem is, roughly speaking, stated in this generality. The proof requires a surprising blend of an intricate combinatorial argument found in Gál's work [18] and the explicit expression for the Poisson kernel on polydiscs. Thus number theory plays a minor role in establishing Theorem 1 and enters the discussion only at the final point, where we need information about the decay of the sequence $(p_j^{-\alpha})$. We will show by an example that Theorem 1 is best possible (up to a constant factor in the exponent) when $1/2 < \alpha < 1$. We will also see that the blow-up of the constant in front of the leading term in $g(\alpha, N)$ is of the right magnitude when $\alpha \nearrow 1$. We conjecture that the blow-up of the same constant when $\alpha \searrow 1/2$ is an artifact and that the estimate in the range $1/2 < \alpha < 1$ should indeed extend to $\alpha = 1/2$, which would then be optimal too. On the other hand, as we will see, the estimates change abruptly when we pass from $\alpha = 1/2$ to $\alpha < 1/2$, as a consequence of the divergence of the series $\sum_j p_j^{-2\alpha}$; the slow divergence when $\alpha = 1/2$ is the reason why this is a particularly delicate case. The range $0 < \alpha < 1/2$, included here for the sake of completeness, is less subtle, and it is easy to give an example showing that the estimate of Theorem 1 is essentially best possible.
The proof of Theorem 1 and the examples showing that our results are essentially optimal will be presented in Section 3 below. An immediate consequence of our reformulation in terms of Poisson integrals is that the corresponding matrices are positive definite. In the subsequent Section 4, we will see that in turn Theorem 1 implies precise estimates for the largest eigenvalues of these matrices, or, equivalently, for their spectral norms.

Applications to systems of dilated functions
Our main application of Theorem 1, to be found in Section 5 below, will be to establish a Carleson--Hunt-type inequality for systems of dilated functions of bounded variation or belonging to $\operatorname{Lip} 1/2$. By standard arguments, this inequality will yield asymptotically precise results for the growth of
\[
(5)\qquad \sum_{k=1}^N f(n_kx)
\]
and for the almost everywhere convergence of
\[
(6)\qquad \sum_{k=1}^\infty c_k f(n_kx)
\]
for functions $f$ of bounded variation or belonging to $\operatorname{Lip} 1/2$ that satisfy
\[
(7)\qquad f(x+1)=f(x), \qquad \int_0^1 f(x)\,dx=0.
\]
Such dilated sums arise in many problems in analytic number theory, Diophantine approximation, uniform distribution theory, harmonic analysis, ergodic theory, and probability theory.

Estimating the sum (5) for centered indicator functions $f=f_{a,b}=\chi_{(a,b)}-(b-a)$, which are extended with period 1, is equivalent to measuring the deviation from uniformity of the distribution of the sequence $(n_kx)_{k\ge1}$ modulo 1, and for $n_k=k$ very precise results are known. Khinchin [29] proved that the discrepancy of the sequence $(kx)_{1\le k\le N}$ satisfies
\[
(8)\qquad D_N(x,2x,\dots,Nx) \ll N^{-1}(\log N)(\log\log N)^{1+\varepsilon} \quad \text{a.e.}
\]
for every $\varepsilon>0$ and that this becomes false for $\varepsilon=0$. Here the discrepancy $D_N(x_1,\dots,x_N)$ of a sequence $x_1,\dots,x_N$ of real numbers is defined as
\[
D_N(x_1,\dots,x_N) := \sup_{0\le a<b\le1}\Big|\frac1N\sum_{k=1}^N f_{a,b}(x_k)\Big|,
\]
where again $f_{a,b}$ denotes the centered indicator function of the interval $(a,b)\subset[0,1]$, extended with period 1. Thus we have
\[
(9)\qquad \sum_{k=1}^N f_{a,b}(kx) \ll (\log N)(\log\log N)^{1+\varepsilon} \quad \text{a.e.}
\]
uniformly for such centered indicators $f_{a,b}$, and, in view of Koksma's inequality (see e.g. [31], p. 143), the same bound holds uniformly for all 1-periodic functions $f$ satisfying (7) and $\operatorname{Var}_{[0,1]}(f)\le1$. In view of Schmidt's lower bound [37] for the discrepancy of arbitrary infinite sequences, the metric discrepancy behavior of $(kx)_{k\ge1}$ is near to extremal.

For general $(n_k)_{k\ge1}$, the situation changes markedly. For $f(x)=2\chi_{[0,1/2)}(x)-1$ (extended to $\mathbb{R}$ with period 1) and $n_k=2^k$, the terms of (5) reduce to the Rademacher functions, and the law of the iterated logarithm implies that for almost all $x\in(0,1)$ the sum (5) exceeds $(N\log\log N)^{1/2}$ for infinitely many $N$.
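Since the discrepancy of $(kx)$ is central here, a small numerical illustration may help. The sketch below computes the star discrepancy exactly via Niederreiter's sorting formula; this is a mild variant of the interval discrepancy defined above, differing from it by at most a factor 2. The function names are our own:

```python
def star_discrepancy(xs):
    """Exact star discrepancy D_N* of a finite point set in [0,1),
    using the classical formula D_N* = 1/(2N) + max_i |x_(i) - (2i-1)/(2N)|."""
    xs = sorted(xs)
    N = len(xs)
    return 1.0 / (2 * N) + max(abs(x - (2 * i + 1) / (2 * N))
                               for i, x in enumerate(xs))

# The Kronecker sequence (k x) mod 1 for x = 1/golden ratio, a classically
# well-distributed choice; N * D_N grows only logarithmically for such x.
points = [(k * 0.618033988749895) % 1.0 for k in range(1, 1001)]
```

Running `star_discrepancy(points)` gives a value far below that of, say, 1000 random points, in line with the near-extremal metric behavior of $(kx)$ described above.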
Berkes and Philipp [6] constructed a sequence $(n_k)_{k\ge1}$ such that, for $f(x)=\{x\}-1/2$ and for almost all $x$, the sum (5) grows even faster along infinitely many $N$. In the opposite direction, R. C. Baker [3] showed, improving earlier results of Cassels [12] and Erdős and Koksma [15], that for every increasing sequence $(n_k)_{k\ge1}$ of integers, the discrepancy of the sequence $(n_kx)_{1\le k\le N}$ satisfies
\[
(10)\qquad D_N(n_1x,\dots,n_Nx) \ll N^{-1/2}(\log N)^{3/2+\varepsilon} \quad \text{a.e.}
\]
for every $\varepsilon>0$. As a consequence, we have
\[
(11)\qquad \sum_{k=1}^N f(n_kx) \ll N^{1/2}(\log N)^{3/2+\varepsilon} \quad \text{a.e.}
\]
uniformly for all $f$ satisfying (7) and $\operatorname{Var}_{[0,1]}(f)\le1$. There is a gap between (9) and (11); in particular, it is not known whether the uniform estimate (11) holds for $\varepsilon=0$ and all $(n_k)_{k\ge1}$.
For a fixed $f \in BV$ (i.e., without uniformity), Aistleitner, Mayer, and Ziegler [2] improved the upper bound in (11), getting for the first time a bound better than $O(\sqrt{N}(\log N)^{3/2})$. (Here, and in the sequel, we write $f \in BV$ if $\operatorname{Var}_{[0,1]} f < \infty$.) Our Carleson--Hunt-type inequality will give the following improvement of this estimate.
Theorem 2. Let $(n_k)_{k\ge1}$ be a strictly increasing sequence of positive integers, let $f$ be a function satisfying (7), and assume in addition that either $f\in BV$ or $f\in\operatorname{Lip} 1/2$. Then for every $\varepsilon>0$,
\[
(12)\qquad \sum_{k=1}^N f(n_kx) = O\big(\sqrt{N}\,(\log N)^{1/2}(\log\log N)^{(5/2)+\varepsilon}\big) \quad \text{a.e.}
\]
This estimate is sharp up to the exact value of the exponent of $\log\log N$, as shown by a result of Berkes and Philipp [6, Theorem 1], who constructed an increasing sequence $(n_k)_{k\ge1}$ for which the sums in (12) attain this order of magnitude up to a power of $\log\log N$, almost everywhere for infinitely many $N$.

The class $\operatorname{Lip} 1/2$ represents an interesting limiting case in this context. Kaufman and Philipp [28] proved that, under the lacunarity condition $n_{k+1}/n_k \ge q > 1$ ($k=1,2,\dots$), the law of the iterated logarithm holds uniformly for all $f\in\operatorname{Lip}\alpha$, $\alpha>1/2$, with a fixed Lipschitz constant, and that this fails for $\alpha<1/2$. The case $\alpha=1/2$ remains open. In the case of Theorem 2, the proof shows that for $f\in\operatorname{Lip}\alpha$, $\alpha>1/2$, the exponent 5/2 in (12) can be replaced by 1/2, and this exponent is best possible.

The second consequence of our version of the Carleson--Hunt inequality deals with the a.e. convergence of series of the form
\[
(14)\qquad \sum_{k=1}^\infty c_k f(n_kx)
\]
for 1-periodic functions $f$. By Carleson's theorem [11], when $f(x)=\sin2\pi x$ or $f(x)=\cos2\pi x$, the series (14) converges a.e. provided that $\sum_{k=1}^\infty c_k^2<\infty$. Gaposhkin [20] showed that this remains valid if the Fourier series of $f$ converges absolutely; in particular, this holds if $f$ belongs to the class $\operatorname{Lip}\alpha$ for some $\alpha>1/2$. However, Nikishin [36] showed that the analogue of Carleson's theorem fails for $f(x)=\operatorname{sgn}\sin2\pi x$, and it also fails for some continuous function $f$. There is an extensive literature on this convergence problem going back to the 1940s (see [7] and [19] for the history of the subject), and sufficient a.e. convergence criteria have been obtained for various classes of functions such as $\operatorname{Lip}\alpha$, $0<\alpha\le1/2$, $L^p$, $BV$, or spaces of functions defined via decay conditions on Fourier coefficients; see e.g. [1, 7, 8, 9, 10, 19, 21, 38]. However, except for Carleson's theorem and its immediate consequences, no precise a.e. convergence criteria for the series (14) have been found.
The following theorem gives an essentially complete solution to the convergence problem for BV and a substantial improvement of known results for the class Lip 1/2 .
Theorem 3. Let $f$ be a function satisfying (7) and assume in addition that either $f\in BV$ or $f\in\operatorname{Lip} 1/2$. Let $(c_k)_{k\ge1}$ be a real sequence satisfying
\[
\sum_{k=1}^\infty c_k^2 (\log\log k)^\gamma < \infty
\]
for some $\gamma>4$. Then for every increasing sequence $(n_k)_{k\ge1}$ of positive integers the series $\sum_{k=1}^\infty c_k f(n_kx)$ converges a.e.

Using the optimality of Gál's theorem and a probabilistic argument, we will in Section 6 show that for every $0<\gamma<2$ there exists an increasing sequence $(n_k)_{k\ge1}$ of positive integers and a real sequence $(c_k)_{k\ge1}$ such that
\[
(15)\qquad \sum_{k=1}^\infty c_k^2 (\log\log k)^\gamma < \infty \quad\text{and}\quad \sum_{k=1}^\infty c_k f(n_kx) \ \text{diverges a.e.}
\]
Thus, apart from the precise value of the exponent of $\log\log k$, Theorem 3 is best possible for $f\in BV$. In the $\operatorname{Lip} 1/2$ case, the argument in Section 6 gives a slightly weaker counterexample, with $\log\log k$ in (15) replaced by $\log\log\log k$. On the other hand, in the case of $f\in\operatorname{Lip}\alpha$, $0<\alpha<1/2$, Theorem 3 of [5] gives an a.e. divergent series of the form (6). Comparing this result with Theorem 3, we see that there is an essential difference between the convergence behavior of the sum (5) for $\alpha=1/2$ and $\alpha<1/2$. We conclude again that $\operatorname{Lip} 1/2$ stands out as a particularly interesting limiting case.

We mention finally two additional applications of Theorem 1. First, we may obtain a substantial improvement of the convergence criteria in [1] and [38] for the case $0<\alpha<1/2$; we will discuss this problem in a subsequent paper. Second, Theorem 1 yields an improvement of a result of Harman [24] on metric Diophantine approximation. The effect of replacing the estimate (4) in Harman's original proof by our Theorem 1 is that a factor of order $\exp(c\log N/\log\log N)$ becomes instead a factor of order $\exp\big(c\sqrt{\log N\log\log N}\big)$. This result is connected with the Duffin--Schaeffer conjecture, a notoriously difficult open problem in metric Diophantine approximation (see [22, 23]).

Proof of Theorem 1 via trigonometric polynomials on $\mathbb{D}^\infty$
We introduce multi-index notation suitable for our purposes. A multi-index is a sequence $\beta=(\beta^{(1)},\beta^{(2)},\dots,\beta^{(R)},0,0,\dots)$ of nonnegative integers with only finitely many of them nonzero. We let $\operatorname{supp}\beta$ be the finite set of positive integers $j$ for which $\beta^{(j)}>0$; we write $R(\beta)$ for the maximal element of $\operatorname{supp}\beta$. Two multi-indices $\beta$ and $\mu$ may be added and subtracted as sequences. Then $\beta-\mu$ may fail to be a multi-index, but the sequence $|\beta-\mu|:=(|\beta^{(j)}-\mu^{(j)}|)$ will again be a multi-index. We may multiply multi-indices by positive integers in the obvious way and express any multi-index as a linear combination of the natural basis elements $e_j$, where $e_j$ is the multi-index supported on $\{j\}$ with $e_j^{(j)}=1$. For a sequence of complex numbers $z=(z_j)$, we use the notation $z^\beta:=\prod_{j\in\operatorname{supp}\beta} z_j^{\beta^{(j)}}$; we will sometimes write $z^{-\beta}$ for the number $(z^\beta)^{-1}$.
We write $p=(p_j)$ for the sequence of prime numbers ordered by ascending magnitude. Using our multi-index notation, we may write every positive integer $n$ as $p^\beta$ for a multi-index $\beta$ that is uniquely determined by $n$. If $n_k=p^{\beta_k}$, then we may write
\[
\frac{(\gcd(n_k,n_\ell))^{2\alpha}}{(n_kn_\ell)^\alpha} = \prod_j p_j^{-\alpha|\beta_k^{(j)}-\beta_\ell^{(j)}|} = t^{|\beta_k-\beta_\ell|}, \qquad t=(p_j^{-\alpha}).
\]
For an arbitrary sequence $t$ of positive numbers in $\mathbb{D}^\infty$ and a set of distinct multi-indices $B=\{\beta_1,\dots,\beta_N\}$, we set
\[
S(t,B):=\sum_{k,\ell=1}^N t^{|\beta_k-\beta_\ell|} \qquad\text{and}\qquad \Gamma_t(N):=\frac1N\sup_B S(t,B),
\]
where the supremum is taken over all possible sets $B$ of distinct multi-indices $\beta_1,\dots,\beta_N$. Our original problem concerning GCD sums has thus been transformed into the problem of estimating $\Gamma_t(N)$ in the particular case when $t=(p_j^{-\alpha})$.

For a minor technical reason, we introduce the following notation. Let $\eta:(0,1)\to(0,1)$ be defined by the relation and for a sequence $t=(t_j)$ with $0<t_j<1$, we set $\eta(t):=(\eta(t_j))$. For a decreasing sequence $t$ of positive numbers in the sequence space $c_0$, we define We will prove the following general theorem.

Theorem 4. Let $t=(t_j)$ be a sequence of positive numbers in $\mathbb{D}^\infty\cap c_0$ such that $(\tau_j):=\eta(t)$ is a decreasing sequence. Fix a positive number $\xi>(\log2)^{-1}$, and set $r_N=[\xi\log N]+\kappa(t)$. Then, for arbitrary numbers where $C$ is a positive constant depending only on $\xi$.
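The reformulation rests on the elementary identity $\gcd(n,m)^{2\alpha}/(nm)^\alpha = t^{|\beta-\mu|}$ for $t=(p_j^{-\alpha})$, where $\beta$ and $\mu$ are the prime-exponent multi-indices of $n$ and $m$. This can be checked mechanically; the sketch below uses a small hard-coded prime table and illustrative function names:

```python
from math import gcd

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]  # enough primes for the demo

def multi_index(n):
    """Exponent vector beta with n = prod_j p_j^{beta^{(j)}} over the table."""
    beta = []
    for p in PRIMES:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        beta.append(e)
    assert n == 1, "n has a prime factor outside the table"
    return beta

def t_power(n, m, alpha):
    """t^{|beta - mu|} for t = (p_j^{-alpha}): prod_j p_j^{-alpha |beta_j - mu_j|}."""
    b, c = multi_index(n), multi_index(m)
    out = 1.0
    for p, bj, cj in zip(PRIMES, b, c):
        out *= p ** (-alpha * abs(bj - cj))
    return out

# Identity behind the reformulation:
#   t_power(n, m, a) == gcd(n, m)**(2*a) / (n*m)**a
```

The identity holds prime by prime, since $\gcd$ takes the minimum exponent and $2\min(b,c)-(b+c)=-|b-c|$.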
This theorem is clearly applicable when the sequence $t$ is in $\ell^2$, but it can also be used when the series $\sum_j t_j^2$ is "slowly" divergent, as we will now see.
Proof of Theorem 1. We now take Theorem 4 for granted and show that it implies Theorem 1. We begin with the case $1/2<\alpha<1$ and observe first that then, for some constant $c$, the exponential term in (16) will contribute only a fixed constant factor, independent of $\varepsilon$, to $C_\varepsilon$. Assuming that $N$ is large enough, we work with the first term on the right-hand side of (16), with $\tau_j=\eta(p_j^{-\alpha})$. (The decay of $\tau_j$ is a minor technical point which can be dealt with by an obvious rearrangement of the sequence.) For smaller $N$, we set $v_j:=\tau_0$ for all $j$. We choose $\xi=2$ and note that $p_j^{-\alpha}<1/2$ for $j\ge3$, whence we have $\tau_j=2p_j^{-\alpha}$ for $j\ge3$ and $r_N=[2\log N]+2$. We fix an index $j_0$ and split the first product accordingly into two factors. Hence, using the definition of $\tau_j$, we obtain, if $j_0$ and thus $s_N$ are large enough, a bound with $C$ an absolute constant. By the prime number theorem, we have $p_j=(1+o(1))\,j\log j$ as $j\to\infty$, so that (17) and (18) become (19), which can be estimated further, whence we finally get the required bound, assuming again that $j_0$ is sufficiently large.
For the second product in (16), we obtain We appeal again to the prime number theorem and get The desired estimate for the function g(α, n) in Theorem 1 follows from our three estimates (21), (20), and (22), if we take into account that the contribution from the factors omitted in the first product in (16) by the restriction on j 0 can be bounded by a constant C ε which is independent of α.
The case $\alpha=1/2$ is dealt with in the same way, the only difference being that we now choose $v_j=\max\big(\eta(p_j^{-1/2}),\,(\log\log N)^{1/2}/(\log N)^{1/2}\big)$. Retaining the notation from the preceding case and assuming that $j_0$ is large enough, we get the respective estimates, where in the last step we used Mertens's second theorem. Combining these estimates, we arrive at the required bound for $g(1/2,N)$, since we may assume that $N$ is so large that $\log\log N\ge1$.
To see to what extent Theorem 1 is sharp for $1/2\le\alpha<1$, we consider the following example: set $N=2^r$ and take $n_1,\dots,n_N$ to be all square-free numbers composed of the first $r$ primes. Then
\[
\frac1N\sum_{k,\ell=1}^N\frac{(\gcd(n_k,n_\ell))^{2\alpha}}{(n_kn_\ell)^\alpha} = \prod_{j=1}^r\big(1+p_j^{-\alpha}\big),
\]
which follows from an argument in [18, p. 21]. By the prime number theorem, we therefore get a lower bound of the form
\[
\exp\Big(\frac{c}{1-\alpha}\,(\log N)^{1-\alpha}(\log\log N)^{-\alpha}\Big)
\]
for some positive constant $c$. Thus our estimate in Theorem 1 is of the right order of magnitude when $1/2<\alpha<1$, as is the blow-up of the multiplicative constant $1/(1-\alpha)$ in $g(\alpha,N)$ when $\alpha\nearrow1$. However, this example does not settle the cases $\alpha\searrow1/2$ and $\alpha=1/2$. In fact, we see that there is a discrepancy of a factor $\log\log N$ in the exponent between our estimate and the lower bound obtained from the example. It seems likely that the blow-up of the constant $c(\alpha)$ when $\alpha\searrow1/2$ is an artifact. The trouble is that the divergence of the series $\sum_j p_j^{-1}$ implies that the number of primes involved in the sum plays a role. We believe the number of primes should be $O(\log N)$ when the sum is maximal, but we can only infer from our method of proof that this number is bounded by $N-1$.
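For the extremal example just described, the GCD sum factorizes over the primes: for each prime, the four in/out membership choices of the two factors contribute $2+2p^{-\alpha}$, so after normalization by $N=2^r$ each prime contributes $1+p^{-\alpha}$. The following sketch (our own numerical check, not taken from [18]) confirms this factorization by brute force:

```python
from math import gcd

PRIMES = [2, 3, 5, 7, 11, 13]

def squarefree_gcd_sum(r, alpha):
    """Normalized GCD sum over the N = 2^r square-free products of the first r primes."""
    ns = [1]
    for p in PRIMES[:r]:
        ns += [n * p for n in ns]   # double the set by multiplying in p
    N = len(ns)
    s = sum(gcd(a, b) ** (2 * alpha) / (a * b) ** alpha for a in ns for b in ns)
    return s / N

def euler_factorization(r, alpha):
    """Closed form: each prime contributes a factor (1 + p^{-alpha})."""
    prod = 1.0
    for p in PRIMES[:r]:
        prod *= 1.0 + p ** (-alpha)
    return prod
```

The agreement of the two functions for small $r$ illustrates how the lower bound $\exp\big(c\sum_{j\le r}p_j^{-\alpha}\big)$ arises.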
Our estimate is, however, essentially optimal when $0<\alpha<1/2$. To see this, it suffices to consider the example $n_1=2, n_2=3,\dots,n_N=p_N$. Using the prime number theorem in a similar way as in the proof of Theorem 1, we obtain that
\[
\frac1N\sum_{k,\ell=1}^N\frac{(\gcd(n_k,n_\ell))^{2\alpha}}{(n_kn_\ell)^\alpha} \ge c\,N^{1-2\alpha}(\log N)^{-2\alpha}
\]
for a positive constant $c$. The reason for the abrupt change at $\alpha=1/2$ is that the relatively fast divergence of $\sum_j p_j^{-2\alpha}$ (as in this example) plays a dominant role when $0<\alpha<1/2$.

We will now prepare for the proof of Theorem 4 by making the passage to Poisson integrals alluded to above. We let $\sigma_K$ denote normalized Lebesgue measure on the unit polycircle $\mathbb{T}^K$ and write
\[
P_K(z,\zeta):=\prod_{j=1}^K\frac{1-|\zeta_j|^2}{|1-\bar{\zeta}_jz_j|^2},
\]
which is the Poisson kernel for the unit polydisc $\mathbb{D}^K$ at the point $\zeta$. It is convenient in this definition to allow $\zeta$ to be a point in the infinite-dimensional polydisc $\mathbb{D}^\infty$. The only property of $P_K$ needed is the identity
\[
\int_{\mathbb{T}^K} z^\beta\,\overline{z^\mu}\,P_K(z,t)\,d\sigma_K(z)=t^{|\beta-\mu|},
\]
valid for positive sequences $t$ in $\mathbb{D}^\infty$, which is obtained by computing the integral over $\mathbb{T}^K$ as an iterated integral over $K$ copies of the unit circle. It leads immediately to the following lemma.
Lemma 1. For a positive sequence $t$ in $\mathbb{D}^\infty$, arbitrary multi-indices $\beta_1,\dots,\beta_N$ with $K=\max_jR(\beta_j)$, and complex numbers $c_1,\dots,c_N$, we have
\[
(23)\qquad \sum_{k,\ell=1}^N t^{|\beta_k-\beta_\ell|}c_k\bar{c}_\ell=\int_{\mathbb{T}^K}\Big|\sum_{j=1}^Nc_jz^{\beta_j}\Big|^2P_K(z,t)\,d\sigma_K(z).
\]

The fact that the quadratic form on the left-hand side of (23) can be written as the square of a norm was first observed in [34] in the special case when $t=(p_j^{-\alpha})$ and $\alpha>1/2$, based on ideas from [25]. The present formulation seems more illuminating and leads to an interesting problem for trigonometric polynomials on $\mathbb{D}^\infty$. We will take a closer look at this problem in the next section, where we will estimate the quadratic form on the left-hand side of (23) relative to the $\ell^2$-norm of the coefficients, or, in other words, the largest eigenvalue of the matrix $(t^{|\beta_k-\beta_\ell|})$.
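The identity (23) can be tested numerically in one variable, where the Poisson kernel is $(1-t^2)/|1-tz|^2$ and the trapezoidal rule is spectrally accurate for smooth periodic integrands. The function names below are illustrative, and the coefficients are taken real for simplicity:

```python
import cmath
import math

def poisson_side(betas, cs, t, M=4096):
    """Right-hand side of (23) for K = 1: trapezoidal quadrature of
    |sum_j c_j z^{beta_j}|^2 against (1 - t^2)/|1 - t z|^2 on the unit circle."""
    total = 0.0
    for m in range(M):
        z = cmath.exp(2j * math.pi * m / M)
        F = sum(c * z ** b for b, c in zip(betas, cs))
        total += abs(F) ** 2 * (1 - t * t) / abs(1 - t * z) ** 2
    return total / M

def matrix_side(betas, cs, t):
    """Left-hand side of (23): sum_{k,l} t^{|beta_k - beta_l|} c_k c_l."""
    return sum(t ** abs(bk - bl) * ck * cl
               for bk, ck in zip(betas, cs)
               for bl, cl in zip(betas, cs))
```

With, say, exponents $(0,1,3)$, unit coefficients, and $t=1/2$, the two sides agree to machine precision, mirroring the one-circle computation that proves the lemma.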
For the proof of Theorem 4, we only need (23) when c k ≡ 1. Incidentally, this restriction is crucial for the combinatorial argument that leads to Lemma 2 below, which is our next auxiliary result. It is interesting to note that this lemma relies on the left-hand side of (23), while the subsequent analytic part of the proof of Theorem 4 departs from the right-hand side of this identity.
We will use a variant of Gál's terminology: A set B of N multi-indices β 1 , ..., β N is said to be κ-canonical for 0 ≤ κ < N if β ∈ B and e j ≤ β for some j with κ < j ≤ N imply that β − e j ∈ B. The following lemma is a modification of a theorem in [18, p. 17].
Lemma 2. Suppose $B$ is a set of $N$ multi-indices, and let $t$ be a decreasing sequence of positive numbers in $\mathbb{D}^\infty\cap c_0$. If $\kappa(t)<N$, then there exists a $\kappa(t)$-canonical set $\widetilde{B}$ of $N$ multi-indices such that $S(\eta(t),\widetilde{B})\ge S(t,B)$.

Proof. We will modify $B$ and $t$ by an inductive algorithm. We break the argument into two parts, the first of which will give a set of multi-indices for which the union of their supports has cardinality at most $N-1$.
Part 1: It will be convenient to use the following terminology. We say that a multi-index $\beta$ in $B$ is $j$-maximal if $j$ is in $\operatorname{supp}\beta$ but $(\beta^{(j)}+1)e_j\not\le\mu$ for every $\mu$ in $B$. We will construct from $B$ a new set $\widetilde{B}$ with the property that if $\beta$ in $\widetilde{B}$ is $j$-maximal, then also $\beta-e_j$ is in $\widetilde{B}$, while at the same time $S(t,\widetilde{B})\ge S(t,B)$. Writing $\widetilde{B}=\{\widetilde\beta_1,\dots,\widetilde\beta_N\}$, we see that, as a consequence, we will have $\#\bigcup_{j=1}^N\operatorname{supp}\widetilde\beta_j\le N-1$.
Fix a positive integer $j$ in $\bigcup_k\operatorname{supp}\beta_k$. Let $\nu$ be the largest integer such that $\nu e_j\le\beta$ for some $\beta$ in $B$. Suppose there is a $j$-maximal multi-index $\beta$ in $B$ such that $\nu e_j\le\beta$ but $\beta-e_j$ is not in $B$. For every such $\beta$, we replace $\beta$ in $B$ by $\beta-e_j$; we call the new set of multi-indices $B_\nu$. A term-by-term comparison shows that $S(t,B_\nu)\ge S(t,B)$.

If there is a $j$-maximal multi-index $\beta$ in $B_\nu$ with $\beta^{(j)}=\nu$, then it must have the desired property that also $\beta-e_j$ is in $B_\nu$, and no further action is needed. In the opposite case, we repeat the argument with $\nu$ replaced by $\nu-1$. The iteration terminates when either the desired property holds for some $B_\eta$ with $1\le\eta\le\nu$ or $j$ is not in the support of any multi-index in $B_1$.

We repeat this iteration for every $j$ in $\bigcup_k\operatorname{supp}\beta_k$ and obtain thus the desired set $\widetilde{B}$.
Part 2: By Part 1, we may from now on assume that, for every $j$ in $\bigcup_k\operatorname{supp}\beta_k$, any $j$-maximal multi-index $\beta$ in $B$ has the property that $\beta-e_j$ is in $B$. This is irrelevant for the argument to be given below, but we need it to reach the desired conclusion about the cardinality of $\bigcup_j\operatorname{supp}\beta_j$. We now assume that $\kappa(t)<N$. We fix a $j>\kappa(t)$ in $\bigcup_j\operatorname{supp}\beta_j$ and divide $B$ into disjoint subsets $b_1,\dots,b_\ell$ ($1\le\ell\le N$), which we call $j$-chains of multi-indices, according to the following rule: two distinct multi-indices $\beta$ and $\mu$ belong to the same $j$-chain $b$ if $|\beta-\mu|=\eta e_j$ for some $\eta>0$. This means that every element $\beta$ in $b$ is of the form $\beta=\mu+\eta e_j$, where $\mu^{(j)}=0$ and $\mu$ is thus a multi-index that characterizes the $j$-chain $b$. We now modify each $j$-chain $b_k$ by replacing it by the set $\widetilde{b}_k:=\{\mu,\mu+e_j,\dots,\mu+(\#b_k-1)e_j\}$, and we set $\widetilde{B}:=\bigcup_{k=1}^\ell\widetilde{b}_k$. It is immediate that $S(t,\widetilde{b})\ge S(t,b)$. To compare the terms of the sum corresponding to pairs of multi-indices from different $j$-chains, we introduce the notation
\[
S(t;a,b):=\sum_{\beta\in a}\sum_{\mu\in b}t^{|\beta-\mu|},
\]
where $a$ and $b$ are two different $j$-chains. Sorting, by descending order of magnitude, the possible values of $|\beta^{(j)}-\mu^{(j)}|$ for all $\beta$ and $\mu$ in $a$ and $b$ and in $\widetilde{a}$ and $\widetilde{b}$, respectively, we obtain the inequality $S(t;a,b)\le S(t+t_je_j;\widetilde{a},\widetilde{b})$ and, more generally, that $S(t+t_je_j,\widetilde{B})\ge S(t,B)$.
The result follows if we make this modification in turn for every $j$ in $\bigcup_k\operatorname{supp}\beta_k$ for which $j>\kappa(t)$.
It is clear that we may assume that $\bigcup_{j=1}^N\operatorname{supp}\beta_j=\{1,2,\dots,K\}$ for some $K\le N-1$, since we are seeking an upper bound for all sums $S(\tau,B)$ and $\tau$ is a decreasing sequence. By Lemma 1 and the orthonormality of the monomials $z^\beta$, we therefore get (24). Let $B_1$ denote the set of those multi-indices $\beta$ such that $R(\beta)\le K$ and $\#\operatorname{supp}\beta\le r_N$, and let $B_2$ denote the set of all other multi-indices $\beta$ with $R(\beta)\le K$. By the Cauchy--Schwarz inequality, we get an estimate splitting the sum accordingly. Since $B$ is assumed to be $\kappa(t)$-canonical, $\#\operatorname{supp}\beta_j\le(\log N)/\log2+\kappa(t)$ for every $j$, and hence $\#\operatorname{supp}(\beta-\beta_j)\ge\varepsilon\log N$ for a positive $\varepsilon$, depending on our choice of $\xi$, when $\beta$ is in $B_2$. We assume for convenience that $\varepsilon\log N$ is an integer. Suppose $2\tau_j^2>e^{-1/\varepsilon}$ for $j=1,\dots,J\le N-1$. Then we may estimate the inner sum over $B_2$ as an Euler product and obtain the bound (25), with a constant $C$ that only depends on $\varepsilon$.

We next consider the summation over $B_1$. Let $\beta$ be an arbitrary multi-index in this set with $\operatorname{supp}\beta=\{j_1,\dots,j_i\}$, where $i\le r_N$ by the definition of $B_1$. For any numbers $v_k$ satisfying the hypothesis of Theorem 4, we define a sequence $w_\beta$ accordingly. We now apply the Cauchy--Schwarz inequality and get (26). Now summing over $\beta$ in $B_1$, changing the order of summation, and using that $(v_j)$ is a nonincreasing sequence, we may plug the resulting estimate into the right-hand side of (26) and bound the sum over $\beta\in B_1$ in terms of an Euler product. We finally observe that, in view of (24), this inequality along with the preceding estimate (25) leads to the desired inequality (16).
It is worth pointing out that the most essential use of Lemma 2 was to reduce the problem to the case when the cardinalities $\#\operatorname{supp}\beta_j$ are uniformly bounded by a constant times $\log N$. It would be desirable to find a way to arrive at this reduction without involving the auxiliary sequence $\eta(t)$. In particular, if this could be done, then our method of proof would allow us to recapture Gál's theorem (3). Unfortunately, we may only conclude from Theorem 4 that $\Gamma_{(p_j^{-1})}(N)\ll(\log\log N)^4$.

Spectral norms of generalized GCD matrices
This section will show that, with little extra effort, we may obtain from Theorem 4 precise estimates for the largest eigenvalues of the matrices $(t^{|\beta_k-\beta_\ell|})$, which we will refer to as generalized GCD matrices. Since, by (23), these matrices are positive definite, we see that
\[
\Lambda_t(N):=\sup_{\beta_1,\dots,\beta_N}\ \sup_{c\ne0}\ \frac{\sum_{k,\ell=1}^N t^{|\beta_k-\beta_\ell|}c_k\bar{c}_\ell}{\sum_{k=1}^N|c_k|^2}
\]
is the least upper bound for these eigenvalues, where the suprema are taken over, respectively, all $N$-tuples of distinct multi-indices $\beta_1,\dots,\beta_N$ and all nonzero vectors $c=(c_1,\dots,c_N)$ in $\mathbb{C}^N$. We may also refer to $\Lambda_t(N)$ as the supremum of the spectral norms of the matrices $(t^{|\beta_k-\beta_\ell|})$ for fixed $N$. The problem of estimating $\Lambda_t(N)$ for $t=(p_j^{-\alpha})$ was raised in [7, p. 10]. Based on purely arithmetical arguments, Hilberdink [26, pp. 362--363] gave precise estimates for the spectral norms of our GCD matrices in the special case when $p^{\beta_j}=j$ or, in other words, for the matrix corresponding to the first $N$ integers.
Trivially, Λ t (N) ≥ Γ t (N). In the opposite direction, we have the following estimate.
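Both the positive definiteness asserted via (23) and the trivial inequality $\Lambda_t(N)\ge\Gamma_t(N)$ can be observed numerically on the matrices built from the first $N$ integers (Hilberdink's special case). A NumPy sketch with illustrative names; the quantity `gamma_like` is the quadratic form at the normalized all-ones vector, a lower bound for the largest eigenvalue:

```python
import numpy as np
from math import gcd

def gcd_gram_matrix(N, alpha):
    """The N x N matrix (gcd(m,n)^{2 alpha} / (m n)^alpha) for the first N integers."""
    n = np.arange(1, N + 1, dtype=float)
    G = np.array([[float(gcd(i, j)) for j in range(1, N + 1)]
                  for i in range(1, N + 1)])
    return G ** (2 * alpha) / np.outer(n, n) ** alpha

A = gcd_gram_matrix(64, 0.5)
eigs = np.linalg.eigvalsh(A)          # eigenvalues in ascending order
lam_min, lam_max = eigs[0], eigs[-1]
gamma_like = A.sum() / 64             # Rayleigh quotient at c_k = 1/sqrt(N)
```

Here `lam_min > 0` reflects positive definiteness, and `lam_max >= gamma_like` is the Rayleigh-quotient form of $\Lambda_t(N)\ge\Gamma_t(N)$ for this particular matrix.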
Theorem 5. We have $\Lambda_t(N)\ll\Gamma_t(N)\log N$ whenever $t$ is a decreasing sequence of positive numbers in $\mathbb{D}^\infty$.
A few remarks are in order before we give the proof of this theorem. First, the result is of interest only when $t$ fails to be in $\ell^1$, because if $t$ is in $\ell^1$, then the easy estimate
\[
(27)\qquad \Lambda_t(N)\le\prod_j\frac{1+t_j}{1-t_j},
\]
which can be obtained from the right-hand side of (23), shows that $\Lambda_t(N)$ is uniformly bounded when $N\to\infty$. Note that a special version of this estimate is given in [34, p. 152]. We will prove both (27) and a corresponding estimate for the smallest eigenvalue of $(t^{|\beta_k-\beta_\ell|})$ at the end of this section, as a generalization of the result in [34, p. 152]. In our terminology, Dyer and Harman [14] obtained (4) from a direct estimate of $\Lambda_t(N)$ for $t=(p_j^{-1/2})$. Besides the results of [34] and [14], we are not aware of previous estimates of $\Lambda_t(N)$ for any other values of $t$. If we combine Theorem 1 with Theorem 5, then we obtain precise estimates for $\Lambda_{(p_j^{-\alpha})}(N)$ when $0<\alpha<1$. From Gál's theorem (3) and Theorem 5 we also get
\[
\Lambda_{(p_j^{-1})}(N)\le c(\log N)(\log\log N)^2
\]
for an absolute constant $c$. A more subtle application of our estimates for GCD sums, to be given in the next section, will lead to the better bound $\Lambda_{(p_j^{-1})}(N)\ll(\log\log N)^4$. An interesting point is that this improved estimate is obtained from Theorem 1 and does not require Gál's theorem.
As an application of our result on spectral norms, we note that we may replace λ N in Theorem 1.1 of [7, p. 10] by our quantity Λ (p −α j ) (N) and then improve Corollary 1.2 of [7, p. 11] significantly by using our estimates for Λ (p −α j ) (N). The phenomenon captured by Theorem 4 and Theorem 5 is interesting from a function theoretic point of view: While holomorphic polynomials F of fixed L 2 norm (in terms of their coefficients) are uniformly bounded at any fixed point in D ∞ ∩ ℓ 2 [13], this is not so in general for the Poisson integrals of |F | 2 . Indeed, the two theorems give a surprisingly precise statement about the relation between the growth of the number of monomials involved in the polynomials and the growth of such Poisson integrals at points ζ in the complement of D ∞ ∩ ℓ 1 . We believe it could be of interest to clarify how these estimates relate to the distributional properties of polynomial chaos as studied for instance in [32].
Finally, we would like to emphasize the striking point that the combinatorial Lemma 2 seems indispensable in the deduction of our estimates for the spectral norms.
Proof of Theorem 5. We will estimate the quadratic form $\sum_{k,\ell=1}^N t^{|\beta_k-\beta_\ell|}c_kc_\ell$ for arbitrary multi-indices $\beta_1,\dots,\beta_N$ and vectors $c=(c_1,\dots,c_N)$ satisfying $\sum_{j=1}^N|c_j|^2=1$. We may clearly assume that the coefficients $c_j$ are nonnegative. By the Cauchy--Schwarz inequality, we get (28). Using (23) and again the Cauchy--Schwarz inequality, and then applying (23) a second time, we also obtain a bound which, by the definition of $C_\ell$ and the fact that $c$ is a unit vector, gives the required control. Returning to (28) and making a final application of (23), we obtain the desired result.

Let now $\lambda_t(N)$ denote the infimum of the smallest eigenvalues of the generalized GCD matrices $(t^{|\beta_k-\beta_\ell|})$ for fixed $N$. We obtain then the following generalization of the theorem in [34, p. 152].
Theorem 6. We have
\[
(29)\qquad \prod_{j=1}^{N-1}\frac{1-t_j}{1+t_j}\ \le\ \lambda_t(N)\ \le\ \Lambda_t(N)\ \le\ \prod_{j=1}^{N-1}\frac{1+t_j}{1-t_j}
\]
whenever $t$ is a decreasing sequence of positive numbers in $\mathbb{D}^\infty$.
Proof. Note first that the expressions on the left and on the right are, respectively, the minimum and the maximum of $P_{N-1}(t,z)$ when $z$ varies over $\mathbb{T}^{N-1}$. Thus the estimates in (29) follow from (23) if we first observe that it suffices to integrate over an $(N-1)$-circle to compute the $L^2(\sigma_K)$-norm of a function of the form $\sum_{j=1}^Nc_jz^{\beta_j}$.

A Carleson-Hunt-type inequality
We have now come to our main application of Theorem 1, namely to establish a Carleson--Hunt-type inequality. To this end, we will require the following special case of the classical Carleson--Hunt inequality [27, Theorem 1].

Lemma 3. There is an absolute constant $c$ such that, for every strictly increasing sequence of positive integers $(n_k)_{1\le k\le N}$ and every sequence of real numbers $(c_k)_{1\le k\le N}$,
\[
\int_0^1\Big(\max_{1\le M\le N}\Big|\sum_{k=1}^Mc_ke^{2\pi in_kx}\Big|\Big)^2dx\le c\sum_{k=1}^Nc_k^2.
\]

Our generalized version of this inequality reads as follows (as in the introduction, we write $f\in BV$ for a function which has bounded variation on $[0,1]$).

Lemma 4. For every function $f$ satisfying (7) with either $f\in BV$ or $f\in\operatorname{Lip} 1/2$, there exists a constant $c$ such that the following holds. For every finite and strictly increasing sequence of positive integers $(n_k)_{1\le k\le N}$ and every associated finite sequence of real numbers $(c_k)_{1\le k\le N}$, we have
\[
(30)\qquad \int_0^1\Big(\max_{1\le M\le N}\Big|\sum_{k=1}^Mc_kf(n_kx)\Big|\Big)^2dx\ \le\ c\,(\log\log N)^4\sum_{k=1}^Nc_k^2.
\]

We do not know whether the exponent of $\log\log N$ is optimal in (30), but the following argument shows that it cannot be smaller than 2 for $f$ in $BV$: if we choose $f(x)=\{x\}-1/2$, then we have the identity
\[
\int_0^1\big(\{mx\}-\tfrac12\big)\big(\{nx\}-\tfrac12\big)\,dx=\frac{(\gcd(m,n))^2}{12mn},
\]
which was first stated by Franel [17] and first proved by Landau [33]. Consequently, for this particular function $f$, the left-hand side of (30) exceeds
\[
\frac1{12}\sum_{k,\ell=1}^N\frac{(\gcd(n_k,n_\ell))^2}{n_kn_\ell}\,c_kc_\ell.
\]
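Franel's identity is easy to confirm numerically: a midpoint rule is accurate here because the integrand is piecewise smooth with only finitely many jumps. An illustrative sketch (function names are our own):

```python
from math import gcd

def b1(x):
    """The centered sawtooth {x} - 1/2."""
    return (x % 1.0) - 0.5

def franel_integral(m, n, M=200_000):
    """Midpoint-rule approximation of int_0^1 ({mx} - 1/2)({nx} - 1/2) dx."""
    return sum(b1(m * (k + 0.5) / M) * b1(n * (k + 0.5) / M)
               for k in range(M)) / M

def franel_closed_form(m, n):
    """Franel's identity: gcd(m, n)^2 / (12 m n)."""
    return gcd(m, n) ** 2 / (12.0 * m * n)
```

For instance, $m=2$, $n=3$ gives $1/72$ from the closed form, and the quadrature matches it to several digits; the error of the midpoint rule is dominated by the $m+n$ jumps of the sawtooth factors, each contributing $O(1/M)$.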
By the optimality of Gál's theorem (3), we know that $\Lambda_{(p_j^{-1})}(N)\gg(\log\log N)^2$ in the terminology of the preceding section, and therefore 2 is a lower bound for the exponent. This can also be seen from Hilberdink's computation of the spectral norm of the GCD matrix $\big((\gcd(m,n))^2/(mn)\big)_{m,n=1}^N$ (see [26]). The argument just given also shows that Lemma 4 implies that $\Lambda_{(p_j^{-1})}(N)\ll(\log\log N)^4$, as announced in the preceding section. Since the maximal operator appearing in Lemma 4 is not needed in the computation of the spectral norm, one may suspect that we could do better if our sole goal were to estimate $\Lambda_{(p_j^{-1})}(N)$. However, the proof given below does not give any better bound if we remove the maximal operator on the left-hand side of (30).
Before turning to the proof of Lemma 4, we introduce the following conventions. We write c for appropriate positive constants, not always the same, which may depend on f , but not on N or anything else. Any additional dependence is made explicit; we may sometimes, for example, write c(ε) instead of c. We will use the notation where g is assumed to be a real-valued function.
Proof of Lemma 4. Let $f$ be any function satisfying (7), and assume that either $f \in \mathrm{BV}$ or $f \in \operatorname{Lip} 1/2$. To simplify the exposition, we assume that $f$ is even, so that its Fourier series is a pure cosine series:
$$f(x) \sim \sum_{j=1}^{\infty} a_j \cos(2\pi j x).$$
Under the assumption that $\sum_k c_k^2 = 1$, the coefficients $c_k$ satisfying $|c_k| \le N^{-2}$ give a negligible contribution to the left-hand side of our maximal inequality. We may therefore assume without loss of generality that $N^{-2} \le |c_k| \le 1$.
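The negligibility of the small coefficients can be made explicit; the following short estimate is our own elaboration along standard lines, using only the triangle inequality and the boundedness of $f$ (which holds in both cases $f \in \mathrm{BV}$ and $f \in \operatorname{Lip} 1/2$):

```latex
\[
\Big\| \max_{1\le M\le N}\Big|\sum_{\substack{1\le k\le M \\ |c_k|\le N^{-2}}} c_k f(n_k x)\Big| \Big\|
\;\le\; \sum_{\substack{1\le k\le N \\ |c_k|\le N^{-2}}} |c_k|\,\sup_x |f(x)|
\;\le\; N \cdot N^{-2} \sup_x |f(x)|
\;=\; N^{-1} \sup_x |f(x)|,
\]
```

which is dominated by $c\big(\sum_k c_k^2\big)^{1/2} = c$ under the normalization $\sum_k c_k^2 = 1$.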
To make our proof as transparent as possible, we will first prove Lemma 4 when $f \in \mathrm{BV}$. The proof for $f \in \operatorname{Lip} 1/2$ is technically more involved and will be given subsequently.

Proof in the case $f \in \mathrm{BV}$: By [39, p. 48], the Fourier coefficients $a_j$ of a function $f$ in $\mathrm{BV}$ satisfy
$$|a_j| \le c\, j^{-1}. \tag{31}$$
We split the Fourier series of $f$ at the index $J$ and write accordingly
$$\sum_{k=1}^{M} c_k f(n_k x) = \sum_{k=1}^{M} c_k \sum_{j \le J} a_j \cos(2\pi j n_k x) + \sum_{k=1}^{M} c_k \sum_{j > J} a_j \cos(2\pi j n_k x), \tag{32}$$
where $J$ will be chosen later. Then, by Minkowski's inequality,

By (31) and Lemma 3, we have

Estimating the second term on the right-hand side of (33) is more difficult. Let arbitrary numbers $0 \le M_1 < M_2 \le N$ be given. We want to find a good estimate for

We now sort the coefficients by size in the same way as we did in the proof of Theorem 5. Hence, for every $\ell$ in $\{0, 1, \dots, \lceil 2\log N\rceil\}$, we define
$$K_\ell := \big\{ k : M_1 < k \le M_2 \ \text{and} \ e^{-\ell-1} < |c_k| \le e^{-\ell} \big\}. \tag{36}$$
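The sorting of the coefficients into the classes $K_\ell$ can be sketched in code. The following illustration is our own (the function name and the choice of feeding it the whole coefficient list, rather than a window $M_1 < k \le M_2$, are simplifying assumptions); it shows that under the normalization $N^{-2} \le |c_k| \le 1$ every index lands in exactly one class $K_\ell$ with $0 \le \ell \le \lceil 2\log N\rceil$.

```python
from math import ceil, exp, log

def sort_coefficients(c, N):
    # Sort the (1-based) indices k into the classes
    #   K_ell = { k : e^{-ell-1} < |c_k| <= e^{-ell} },  ell = 0, ..., ceil(2 log N),
    # assuming the normalization N^{-2} <= |c_k| <= 1 discussed in the text.
    # Since e^{-(L+1)} < N^{-2} for L = ceil(2 log N), every index is captured.
    L = ceil(2 * log(N))
    classes = {ell: [] for ell in range(L + 1)}
    for k, ck in enumerate(c, start=1):
        for ell in range(L + 1):
            if exp(-ell - 1) < abs(ck) <= exp(-ell):
                classes[ell].append(k)
                break
    return classes

# Example with N = 4 (so N^{-2} = 1/16 and L = 3):
classes = sort_coefficients([1.0, 0.3, 0.07], 4)
assert classes[0] == [1]   # e^{-1} < 1.0  <= 1
assert classes[1] == [2]   # e^{-2} < 0.3  <= e^{-1}
assert classes[2] == [3]   # e^{-3} < 0.07 <= e^{-2}
```

The disjointness of the classes is what later allows the contributions of the different $K_\ell$ to be summed by Minkowski's inequality.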
As observed above, we may assume that $N^{-2} \le |c_k| \le 1$ for $1 \le k \le N$. Thus

Now let an arbitrary $\ell$ in $\{0, 1, \dots, \lceil 2\log N\rceil\}$ be fixed, and set $N_\ell := \# K_\ell$. By (31) and the orthogonality of the trigonometric system, we have

Let $v < w$ be two positive integers. Then, following an argument of Koksma [30], we have

On the other hand, as in [2, p. 104], we have

Let $0 < \varepsilon < 1$ be a number to be chosen later. Combining (38) and (39), we obtain

Thus the integral in (37) is bounded by

which, by Theorem 1 (for $\alpha = 1 - \varepsilon/2$), is at most

By Minkowski's inequality, we therefore get the following estimate for (35):

Applying the Cauchy–Schwarz inequality, we infer from this bound that

The constant $\hat{c}$ in (41) is marked by a hat to indicate that its value (unlike the values of the other constants denoted by $c$) does not change in the sequel. Without loss of generality, we may assume that $\hat{c} \ge 4$. We now choose $J$ by requiring that

Now, imitating the proof of the Rademacher–Menshov inequality (see [35, p. 123]), we see that this estimate implies

Choosing $\varepsilon = 1/(\log\log N)$ and recalling that $\hat{c} \ge 4$, we see that the expression in (43) is bounded by $c\big(\sum_{k=1}^{N} c_k^2\big)^{1/2}$. On the other hand,

Proof in the case $f \in \operatorname{Lip} 1/2$: If $f \in \operatorname{Lip} 1/2$, then by [39, p. 241] we have
$$\Big(\sum_{j=2^{m}}^{2^{m+1}} a_j^2\Big)^{1/2} \le c\, 2^{-m/2}. \tag{45}$$
Note that if $f \in \mathrm{BV}$, then (45) also holds as a consequence of (31); thus the proof for the case $f \in \mathrm{BV}$ could have been included in the present proof. However, (45) is significantly weaker than (31), which makes the proof in the present case more complicated. By the Cauchy–Schwarz inequality, (45) implies that

We again use the decomposition (32), with $J$ to be chosen later. We estimate the second term on the right-hand side of (33). To this end, assume that $0 < \varepsilon < 1$, and set

Let $0 \le M_1 < M_2 \le N$ be given, and let $\mu$ denote the largest integer such that $2^\mu \le J$.
Replacing all coefficients by their absolute values (which is permitted because of the orthogonality of the trigonometric system), starting the summation at $2^\mu$ instead of $J$, and applying Minkowski's inequality twice, we get

We reverse the order of summation and use Minkowski's inequality along with (47), (45), and the orthogonality of the trigonometric system to estimate the second norm on the right-hand side of this inequality. Using also the definition of $S_m$ to deal with the first norm, we therefore get:

We finally turn to the example showing that Theorem 3 is essentially best possible for the class $\mathrm{BV}$. In what follows, we will use the notation $\varphi(x) := \{x\} - 1/2$. Our arguments will be probabilistic, and we will use the symbols $\mathbb{P}$ and $\mathbb{E}$ with respect to the unit interval equipped with the Borel sets and Lebesgue measure.
Theorem 7. For every $0 < \gamma < 2$, there exist an increasing sequence $(n_k)_{k \ge 1}$ of positive integers and a real sequence $(c_k)_{k \ge 1}$ such that $\sum_{k=1}^{\infty} c_k^2 (\log\log k)^{\gamma} < \infty$, but $\sum_{k=1}^{\infty} c_k \varphi(n_k x)$ is a.e. divergent. We will need the following variant of Lemma 2 of [5].
Since $p_{m+1} \ge 16 q_m$ and since each $k \in I_{m+1}$ is a multiple of $2^{p_{m+1}}$, each interval $U_j$ in (54) is a period interval for all $\varphi(kx)$, $k \in I_{m+1}$, and thus also for $\xi_k$, $k \in I_{m+1}$. Hence $Y_{m+1}$ is independent of the $\sigma$-field $\mathcal{F}_m$, and since $\mathcal{F}_1 \subset \mathcal{F}_2 \subset \cdots$ and $Y_m$ is $\mathcal{F}_m$-measurable, the random variables $Y_1, Y_2, \dots$ are independent. Finally, $\mathbb{E}\xi_k = 0$ and thus $\mathbb{E}Y_m = 0$.
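The independence mechanism rests on an elementary fact: if $k$ is a multiple of $2^{p}$, then $\varphi(kx)$ has period $2^{-p}$, so every dyadic interval of length $2^{-p}$ is a period interval for $\varphi(kx)$. A quick numerical check of this fact (our own illustration; the specific values of $p$, $k$, and the sample points are arbitrary choices):

```python
def phi(x):
    # phi(x) = {x} - 1/2, the centered sawtooth from the text.
    return x % 1.0 - 0.5

# If k is a multiple of 2^p, then k * (x + 2^{-p}) = k * x + (an integer),
# so phi(k x) is invariant under the shift x -> x + 2^{-p}.
p = 5
k = 3 * 2 ** p          # a multiple of 2^p
for i in range(100):
    x = i / 997.0       # arbitrary sample points in [0, 1)
    assert abs(phi(k * (x + 2.0 ** -p)) - phi(k * x)) < 1e-9
```

This is exactly why the blocks $Y_{m+1}$, built from frequencies divisible by $2^{p_{m+1}}$, cannot "see" the coarser dyadic structure generating $\mathcal{F}_m$.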
Proof of Theorem 7. We will actually prove a little more than what is stated in the theorem: we show that for any positive sequence $\varepsilon_k \to 0$, there exist an increasing sequence $(n_k)_{k \ge 1}$ of integers and a real sequence $(c_k)_{k \ge 1}$ such that $\sum_{k=1}^{\infty} c_k^2 (\log\log k)^2 \varepsilon_k < \infty$ and