Lecture Note
University: Massachusetts Institute of Technology
Course: 18.785 | Number Theory I
Academic year: 2021
Author:
Dolimanomnjgg
18.785 Fall 2021, Lecture #16, Page 6

Lemma 16.11. The function $\Phi(s) - \frac{1}{s-1}$ extends to a meromorphic function on $\mathrm{Re}(s) > \frac{1}{2}$ that is holomorphic on $\mathrm{Re}(s) \ge 1$.

Proof. By Theorem 16.3, $\zeta(s)$ extends to a meromorphic function on $\mathrm{Re}(s) > 0$, which we also denote $\zeta(s)$, that has only a simple pole at $s = 1$ and no zeros on $\mathrm{Re}(s) \ge 1$, by Corollary 16.5. It follows that the logarithmic derivative $\zeta'(s)/\zeta(s)$ of $\zeta(s)$ is meromorphic on $\mathrm{Re}(s) > 0$, and that on $\mathrm{Re}(s) \ge 1$ its only pole is a simple pole at $s = 1$ with residue $-1$ (see §16.3.1 for standard facts about the logarithmic derivative of a meromorphic function). In terms of the Euler product, for $\mathrm{Re}(s) > 1$ we have (see footnote 6)
\[
\begin{aligned}
-\frac{\zeta'(s)}{\zeta(s)} &= \big(-\log \zeta(s)\big)' = \Big(-\log \prod_p (1 - p^{-s})^{-1}\Big)' = \Big(\sum_p \log(1 - p^{-s})\Big)' \\
&= \sum_p \frac{p^{-s}\log p}{1 - p^{-s}} = \sum_p \frac{\log p}{p^s - 1} = \sum_p \Big(\frac{1}{p^s} + \frac{1}{p^s(p^s - 1)}\Big)\log p \\
&= \Phi(s) + \sum_p \frac{\log p}{p^s(p^s - 1)}.
\end{aligned}
\]
The sum on the RHS converges absolutely and locally uniformly to a holomorphic function on $\mathrm{Re}(s) > 1/2$. The LHS is meromorphic on $\mathrm{Re}(s) > 0$, and on $\mathrm{Re}(s) \ge 1$ it has only a simple pole at $s = 1$ with residue $1$. It follows that $\Phi(s) - \frac{1}{s-1}$ extends to a meromorphic function on $\mathrm{Re}(s) > \frac{1}{2}$ that is holomorphic on $\mathrm{Re}(s) \ge 1$.

Corollary 16.12.
The functions $\Phi(s+1) - \frac{1}{s}$ and $(\mathcal{L}H)(s) = \frac{\Phi(s+1)}{s+1} - \frac{1}{s}$ both extend to meromorphic functions on $\mathrm{Re}(s) > -\frac{1}{2}$ that are holomorphic on $\mathrm{Re}(s) \ge 0$.

Proof. The first statement follows immediately from the lemma. For the second, note that
\[
\frac{\Phi(s+1)}{s+1} - \frac{1}{s} = \frac{1}{s+1}\Big(\Phi(s+1) - \frac{1}{s}\Big) - \frac{1}{s+1}
\]
is meromorphic on $\mathrm{Re}(s) > -\frac{1}{2}$ and holomorphic on $\mathrm{Re}(s) \ge 0$, since it is a sum of products of such functions.

Footnote 6. As is standard when computing logarithmic derivatives, we are taking the principal branch of the complex logarithm and can safely ignore the negative real axis where it is not defined, since we are assuming $\mathrm{Re}(s) > 1$.

The final step of the proof relies on the following analytic result due to Newman [8].

Theorem 16.13. Let $f \colon \mathbb{R}_{\ge 0} \to \mathbb{R}$ be a bounded piecewise continuous function, and suppose its Laplace transform extends to a holomorphic function $g(s)$ on $\mathrm{Re}(s) \ge 0$. Then the integral $\int_0^\infty f(t)\,dt$ converges and is equal to $g(0)$.

Proof. Without loss of generality we assume $|f(t)| \le 1$ for all $t \ge 0$. For $\tau \in \mathbb{R}_{>0}$, define $g_\tau(s) := \int_0^\tau f(t)e^{-st}\,dt$. By definition $\int_0^\infty f(t)\,dt = \lim_{\tau\to\infty} g_\tau(0)$, thus it suffices to prove
\[
\lim_{\tau\to\infty} g_\tau(0) = g(0).
\]
For $r > 0$, let $\gamma_r$ be the boundary of the region $\{s : |s| \le r \text{ and } \mathrm{Re}(s) \ge -\delta_r\}$ with $\delta_r > 0$ chosen so that $g$ is holomorphic on $\gamma_r$; such a $\delta_r$ exists because $g$ is holomorphic on $\mathrm{Re}(s) \ge 0$, hence on some open ball $B_{<2\delta(y)}(iy)$ for each $y \in [-r, r]$, and we may take $\delta_r := \inf\{\delta(y) : y \in [-r, r]\}$, which is positive because $[-r, r]$ is compact. Each $\gamma_r$ is a simple closed curve, and for each $\tau > 0$ the function $h(s) := (g(s) - g_\tau(s))\,e^{s\tau}\big(1 + \frac{s^2}{r^2}\big)$ is holomorphic on a region containing $\gamma_r$. Using Cauchy's integral formula (Theorem 16.26) to evaluate $h(0)$ yields
\[
g(0) - g_\tau(0) = h(0) = \frac{1}{2\pi i}\int_{\gamma_r} \big(g(s) - g_\tau(s)\big)\,e^{s\tau}\Big(\frac{1}{s} + \frac{s}{r^2}\Big)\,ds. \tag{2}
\]
We will show the LHS tends to 0 as $\tau \to \infty$ by showing that for any $\varepsilon > 0$ we can set $r = 3/\varepsilon > 0$ so that the absolute value of the RHS is less than $\varepsilon$ for all sufficiently large $\tau$.
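Let $\gamma_r^+$ denote the part of $\gamma_r$ in $\mathrm{Re}(s) > 0$, a semicircle of radius $r$. The integrand is absolutely bounded by $2/r^2$ on $\gamma_r^+$, since for $|s| = r$ and $\mathrm{Re}(s) > 0$ we have
\[
\begin{aligned}
\Big|\big(g(s) - g_\tau(s)\big)\,e^{s\tau}\Big(\frac{1}{s} + \frac{s}{r^2}\Big)\Big|
&= \Big|\int_\tau^\infty f(t)e^{-st}\,dt\Big| \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r}\cdot\Big|\frac{r}{s} + \frac{s}{r}\Big| \\
&\le \int_\tau^\infty e^{-\mathrm{Re}(s)t}\,dt \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r}\cdot\frac{2\,\mathrm{Re}(s)}{r} \\
&= \frac{e^{-\mathrm{Re}(s)\tau}}{\mathrm{Re}(s)}\cdot\frac{e^{\mathrm{Re}(s)\tau}}{r}\cdot\frac{2\,\mathrm{Re}(s)}{r} = \frac{2}{r^2}.
\end{aligned}
\]
Therefore
\[
\Big|\frac{1}{2\pi i}\int_{\gamma_r^+} \big(g(s) - g_\tau(s)\big)\,e^{s\tau}\Big(\frac{1}{s} + \frac{s}{r^2}\Big)\,ds\Big| \le \frac{1}{2\pi}\cdot \pi r\cdot\frac{2}{r^2} = \frac{1}{r}. \tag{3}
\]
Now let $\gamma_r^-$ be the part of $\gamma_r$ in $\mathrm{Re}(s) < 0$, a truncated semicircle. For any fixed $r$, the first term $g(s)e^{s\tau}(s^{-1} + s r^{-2})$ in the integrand of (2) tends to 0 as $\tau \to \infty$ for $\mathrm{Re}(s) < 0$ and $|s| \le r$. For the second term we note that since $g_\tau(s)$ is holomorphic on $\mathbb{C}$, it makes no difference if we instead integrate over the semicircle of radius $r$ in $\mathrm{Re}(s) < 0$. For $|s| = r$ and $\mathrm{Re}(s) < 0$ we then have
\[
\begin{aligned}
\Big|g_\tau(s)\,e^{s\tau}\Big(\frac{1}{s} + \frac{s}{r^2}\Big)\Big|
&= \Big|\int_0^\tau f(t)e^{-st}\,dt\Big| \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r}\cdot\Big|\frac{r}{s} + \frac{s}{r}\Big| \\
&\le \int_0^\tau e^{-\mathrm{Re}(s)t}\,dt \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r}\cdot\frac{-2\,\mathrm{Re}(s)}{r} \\
&= \frac{1 - e^{-\mathrm{Re}(s)\tau}}{\mathrm{Re}(s)}\cdot\frac{e^{\mathrm{Re}(s)\tau}}{r}\cdot\frac{-2\,\mathrm{Re}(s)}{r} = \frac{2}{r^2}\big(1 - e^{\mathrm{Re}(s)\tau}\big),
\end{aligned}
\]
where the factor $(1 - e^{\mathrm{Re}(s)\tau})$ on the RHS tends to 1 as $\tau \to \infty$ since $\mathrm{Re}(s) < 0$. We thus obtain the bound $1/r + o(1)$ when we replace $\gamma_r^+$ with $\gamma_r^-$ in (3), and the RHS of (2) is bounded by $2/r + o(1)$ as $\tau \to \infty$. It follows that for any $\varepsilon > 0$, for $r = 3/\varepsilon > 0$ we have $|g(0) - g_\tau(0)| < 3/r = \varepsilon$ for all sufficiently large $\tau$. Therefore $\lim_{\tau\to\infty} g_\tau(0) = g(0)$ as desired.

Remark 16.14. Theorem 16.13 is an example of what is known as a Tauberian theorem. For a piecewise continuous function $f \colon \mathbb{R}_{\ge 0} \to \mathbb{R}$, its Laplace transform
\[
\mathcal{L}f(s) := \int_0^\infty e^{-st} f(t)\,dt
\]
is typically not defined on $\mathrm{Re}(s) \le c$, where $c$ is the least $c$ for which $f(t) = O(e^{ct})$. Now it may happen that the function $\mathcal{L}f$ has an analytic continuation to a larger domain; for example, if $f(t) = e^t$ then $(\mathcal{L}f)(s) = \frac{1}{s-1}$ extends to a holomorphic function on $\mathbb{C} - \{1\}$.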
But plugging values of $s$ with $\mathrm{Re}(s) \le c$ into the integral usually does not work; in our $f(t) = e^t$ example, the integral diverges on $\mathrm{Re}(s) \le 1$. The theorem says that when $\mathcal{L}f$ extends to a holomorphic function on the entire half-plane $\mathrm{Re}(s) \ge 0$, its value at $s = 0$ is exactly what we would get by simply plugging 0 into the integral defining $\mathcal{L}f$.

More generally, Tauberian theorems refer to results related to transforms $f \mapsto T(f)$ that allow us to deduce properties of $f$ (such as the convergence of $\int_0^\infty f(t)\,dt$) from properties of $T(f)$ (such as analytic continuation to $\mathrm{Re}(s) \ge 0$). The term "Tauberian" was coined by Hardy and Littlewood and refers to Alfred Tauber, who proved a theorem of this type as a partial converse to a theorem of Abel.

Theorem 16.15 (Prime Number Theorem). $\pi(x) \sim \frac{x}{\log x}$.

Proof. $H(t) = \vartheta(e^t)e^{-t} - 1$ is piecewise continuous and bounded, by Lemma 16.7, and its Laplace transform extends to a holomorphic function on $\mathrm{Re}(s) \ge 0$, by Corollary 16.12. Theorem 16.13 then implies that the integral
\[
\int_0^\infty H(t)\,dt = \int_0^\infty \big(\vartheta(e^t)e^{-t} - 1\big)\,dt
\]
converges. Replacing $t$ with $\log x$, we see that
\[
\int_1^\infty \Big(\frac{\vartheta(x)}{x} - 1\Big)\frac{dx}{x} = \int_1^\infty \frac{\vartheta(x) - x}{x^2}\,dx
\]
converges. Lemma 16.8 implies $\vartheta(x) \sim x$, equivalently, $\pi(x) \sim \frac{x}{\log x}$, by Theorem 16.6.

One disadvantage of our proof is that it does not give us an error term. Using more sophisticated methods, Korobov [6] and Vinogradov [14] independently obtained the bound
\[
\pi(x) = \mathrm{Li}(x) + O\!\left(\frac{x}{\exp\big((\log x)^{3/5 + o(1)}\big)}\right),
\]
in which we note that the error term is bounded by $O(x/(\log x)^n)$ for all $n$ but not by $O(x^{1-\varepsilon})$ for any $\varepsilon > 0$. Assuming the Riemann Hypothesis, which states that the zeros of $\zeta(s)$ in the critical strip $0 < \mathrm{Re}(s) < 1$ all lie on the line $\mathrm{Re}(s) = \frac{1}{2}$, one can prove
\[
\pi(x) = \mathrm{Li}(x) + O\big(x^{1/2 + o(1)}\big).
\]
More generally, if we knew that $\zeta(s)$ has no zeros in the critical strip with real part greater than $c$, for some $c \ge 1/2$ strictly less than 1, we could prove $\pi(x) = \mathrm{Li}(x) + O(x^{c + o(1)})$.
There thus remains a large gap between what we can prove about the distribution of prime numbers and what we believe to be true. Remarkably, other than refinements to the $o(1)$ term appearing in the Korobov-Vinogradov bound, essentially no progress has been made on this problem in the last 60 years.

16.3 A quick recap of some basic complex analysis

The complex numbers $\mathbb{C}$ are a topological field under the distance metric $d(x, y) = |x - y|$ induced by the standard absolute value $|z| := \sqrt{z\bar{z}}$, which is also a norm on $\mathbb{C}$ as an $\mathbb{R}$-vector space; all references to the topology on $\mathbb{C}$ (open, compact, convergence, limits, etc.) are made with this understanding.

16.3.1 Glossary of terms and standard theorems

Let $f$ and $g$ denote complex functions defined on an open subset of $\mathbb{C}$.
• $f$ is differentiable at $z_0$ if $\lim_{z\to z_0} \frac{f(z) - f(z_0)}{z - z_0}$ exists.
• $f$ is holomorphic at $z_0$ if it is differentiable on an open neighborhood of $z_0$.
• $f$ is analytic at $z_0$ if there is an open neighborhood of $z_0$ in which $f$ can be defined by a power series $f(z) = \sum_{n=0}^\infty a_n (z - z_0)^n$; equivalently, $f$ is infinitely differentiable and has a convergent Taylor series on an open neighborhood of $z_0$.
• Theorem: $f$ is holomorphic at $z_0$ if and only if it is analytic at $z_0$.
• Theorem: If $C$ is a connected set containing a nonempty open set $U$ and $f$ and $g$ are holomorphic on $C$ with $f|_U = g|_U$, then $f|_C = g|_C$.
• With $U$ and $C$ as above, if $f$ is holomorphic on $U$ and $g$ is holomorphic on $C$ with $f|_U = g|_U$, then $g$ is the (unique) analytic continuation of $f$ to $C$ and $f$ extends to $g$.
• If $f$ is holomorphic on a punctured open neighborhood of $z_0$ and $|f(z)| \to \infty$ as $z \to z_0$ then $z_0$ is a pole of $f$; note that the set of poles of $f$ is necessarily a discrete set.
• $f$ is meromorphic at $z_0$ if it is holomorphic at $z_0$ or has $z_0$ as a pole.
• Theorem: If $f$ is meromorphic at $z_0$ then it can be defined by a Laurent series $f(z) = \sum_{n \ge n_0} a_n (z - z_0)^n$ that converges on an open punctured neighborhood of $z_0$.
• The order of vanishing $\mathrm{ord}_{z_0}(f)$ of a nonzero function $f$ that is meromorphic at $z_0$ is the least index $n$ of the nonzero coefficients $a_n$ in its Laurent series expansion at $z_0$. Thus $z_0$ is a pole of $f$ iff $\mathrm{ord}_{z_0}(f) < 0$ and $z_0$ is a zero of $f$ iff $\mathrm{ord}_{z_0}(f) > 0$.
• If $\mathrm{ord}_{z_0}(f) = 1$ then $z_0$ is a simple zero of $f$, and if $\mathrm{ord}_{z_0}(f) = -1$ it is a simple pole.
• The residue $\mathrm{res}_{z_0}(f)$ of a function $f$ meromorphic at $z_0$ is the coefficient $a_{-1}$ in its Laurent series expansion $f(z) = \sum_{n \ge n_0} a_n (z - z_0)^n$ at $z_0$.
• Theorem: If $z_0$ is a simple pole of $f$ then $\mathrm{res}_{z_0}(f) = \lim_{z \to z_0} (z - z_0) f(z)$.
• Theorem: If $f$ is meromorphic on a set $S$ then so is its logarithmic derivative $f'/f$, and $f'/f$ has only simple poles in $S$ and $\mathrm{res}_{z_0}(f'/f) = \mathrm{ord}_{z_0}(f)$ for all $z_0 \in S$. In particular the poles of $f'/f$ are precisely the zeros and poles of $f$.

16.3.2 Convergence

Recall that a series $\sum_{n=1}^\infty a_n$ of complex numbers converges absolutely if the series $\sum_n |a_n|$ of nonnegative real numbers converges. An equivalent definition is that the function $a(n) := a_n$ is integrable with respect to the counting measure $\mu$ on the set of positive integers $\mathbb{N}$. Indeed, if the series is absolutely convergent then
\[
\sum_{n=1}^\infty a_n = \int_{\mathbb{N}} a(n)\,\mu,
\]
and if the series is not absolutely convergent, the integral is not defined. Absolute convergence is effectively built into the definition of the Lebesgue integral, which requires that in order for the function $a(n) = x(n) + i y(n)$ to be integrable, the positive real functions $|x(n)|$ and $|y(n)|$ must both be integrable (summable), and separately computes sums of the positive and negative subsequences of $(x(n))$ and $(y(n))$ as suprema over finite subsets.

The measure-theoretic perspective has some distinct advantages. It makes it immediately clear that we may replace the index set $\mathbb{N}$ with any set of the same cardinality, since the counting measure depends only on the cardinality of $\mathbb{N}$, not its ordering.
We are thus free to sum over any countable index set, including $\mathbb{Z}$, $\mathbb{Q}$, any finite product of countable sets, and any countable coproduct of countable sets (such as countable direct sums of $\mathbb{Z}$); such sums are ubiquitous in number theory and many cannot be meaningfully interpreted as limits of partial sums in the usual sense, since this assumes that the index set is well ordered (not the case with $\mathbb{Q}$, for example). The measure-theoretic view also makes it clear that we may convert any absolutely convergent sum of the form $\sum_{X \times Y}$ into an iterated sum $\sum_X \sum_Y$ (or vice versa), via Fubini's theorem.

We say that an infinite product $\prod_n a_n$ of nonzero complex numbers is absolutely convergent when the sum $\sum_n \log a_n$ is, in which case $\prod_n a_n := \exp\big(\sum_n \log a_n\big)$ (see footnote 7). This implies that an absolutely convergent product cannot converge to zero, and the sequence $(a_n)$ must converge to 1 (no matter how we order the $a_n$). All of our remarks above about absolutely convergent series apply to absolutely convergent products as well.

A series or product of complex functions $f_n(z)$ is absolutely convergent on $S$ if the series or product of complex numbers $f_n(z_0)$ is absolutely convergent for all $z_0 \in S$.

Definition 16.16. A sequence of complex functions $(f_n)$ converges uniformly on $S$ if there is a function $f$ such that for every $\varepsilon > 0$ there is an integer $N$ for which $\sup_{z \in S} |f_n(z) - f(z)| < \varepsilon$ for all $n \ge N$. The sequence $(f_n)$ converges locally uniformly on $S$ if every $z_0 \in S$ has an open neighborhood $U$ for which $(f_n)$ converges uniformly on $U \cap S$. When applied to a series of functions these terms refer to the sequence of partial sums.

Because $\mathbb{C}$ is locally compact, locally uniform convergence is the same thing as compact convergence: a sequence of functions converges locally uniformly on $S$ if and only if it converges uniformly on every compact subset of $S$.

Theorem 16.17.
A sequence or series of holomorphic functions $f_n$ that converges locally uniformly on an open set $U$ converges to a holomorphic function $f$ on $U$, and the sequence or series of derivatives $f_n'$ then converges locally uniformly to $f'$ (and if none of the $f_n$ has a zero in $U$ and $f \ne 0$, then $f$ has no zeros in $U$).

Proof. See [3, Thm. III.1.3] and [3, Thm. III.7.2].

Definition 16.18. A series of complex functions $\sum_n f_n(z)$ converges normally on a set $S$ if $\sum_n \|f_n\| := \sum_n \sup_{z \in S} |f_n(z)|$ converges. The series $\sum_n f_n(z)$ converges locally normally on $S$ if every $z_0 \in S$ has an open neighborhood $U$ on which $\sum_n f_n(z)$ converges normally.

Theorem 16.19 (Weierstrass M-test). Every locally normally convergent series of functions converges absolutely and locally uniformly. Moreover, a series $\sum_n f_n$ of holomorphic functions on $S$ that converges locally normally converges to a holomorphic function $f$ on $S$, and then $\sum_n f_n'$ converges locally normally to $f'$.

Proof. See [3, Thm. III.1.6].

Footnote 7. In this definition we use the principal branch of $\log z := \log |z| + i\,\mathrm{Arg}\, z$ with $\mathrm{Arg}\, z \in (-\pi, \pi)$.

Remark 16.20. To show a series $\sum_n f_n$ is locally normally convergent on a set $S$ amounts to proving that for every $z_0 \in S$ there is an open neighborhood $U$ of $z_0$ and a sequence of real numbers $(M_n)$ such that $|f_n(z)| \le M_n$ for $z \in U \cap S$ and $\sum_n M_n < \infty$, whence the term "M-test".

16.3.3 Contour integration

We shall restrict our attention to integrals along contours defined by piecewise-smooth parameterized curves; this covers all the cases we shall need.

Definition 16.21. A parameterized curve is a continuous function $\gamma \colon [a, b] \to \mathbb{C}$ whose domain is a compact interval $[a, b] \subseteq \mathbb{R}$. We say that $\gamma$ is smooth if it has a continuous nonzero derivative on $[a, b]$, and piecewise-smooth if $[a, b]$ can be partitioned into finitely many subintervals on which the restriction of $\gamma$ is smooth. We say that $\gamma$ is closed if $\gamma(a) = \gamma(b)$, and simple if it is injective on $[a, b)$ and $(a, b]$.
Henceforth we will use the term curve to refer to any piecewise-smooth parameterized curve $\gamma$, or to its oriented image in the complex plane (directed from $\gamma(a)$ to $\gamma(b)$), which we may also denote $\gamma$.

Definition 16.22. Let $f \colon \Omega \to \mathbb{C}$ be a continuous function and let $\gamma$ be a curve in $\Omega$. We define the contour integral
\[
\int_\gamma f(z)\,dz := \int_a^b f(\gamma(t))\,\gamma'(t)\,dt,
\]
whenever the integral on the RHS (which is defined as a Riemann sum in the usual way) converges. Whether $\int_\gamma f(z)\,dz$ converges, and if so, to what value, does not depend on the parameterization of $\gamma$: if $\tilde\gamma$ is another parameterized curve with the same (oriented) image as $\gamma$, then $\int_{\tilde\gamma} f(z)\,dz = \int_\gamma f(z)\,dz$.

We have the following analog of the fundamental theorem of calculus.

Theorem 16.23. Let $\gamma \colon [a, b] \to \mathbb{C}$ be a curve in an open set $\Omega$ and let $f \colon \Omega \to \mathbb{C}$ be a holomorphic function. Then
\[
\int_\gamma f'(z)\,dz = f(\gamma(b)) - f(\gamma(a)).
\]
Proof. See [2, Prop. 4.12].

Recall that the Jordan curve theorem implies that every simple closed curve $\gamma$ partitions $\mathbb{C}$ into two components, one of which we may unambiguously designate as the interior (the one on the left as we travel along our oriented curve). We say that $\gamma$ is contained in an open set $U$ if both $\gamma$ and its interior lie in $U$. The interior of $\gamma$ is a simply connected set, and if an open set $U$ contains $\gamma$ then it contains a simply connected open set that contains $\gamma$.

Theorem 16.24 (Cauchy's Theorem). Let $U$ be an open set containing a simple closed curve $\gamma$. For any function $f$ that is holomorphic on $U$ we have
\[
\int_\gamma f(z)\,dz = 0.
\]
Proof. See [2, Thm. 8.6] (we can restrict $U$ to a simply connected set).

Cauchy's theorem generalizes to meromorphic functions.

Theorem 16.25 (Cauchy Residue Formula). Let $U$ be an open set containing a simple closed curve $\gamma$. Let $f$ be a function that is meromorphic on $U$, let $z_1, \ldots, z_n$ be the poles of $f$ that lie in the interior of $\gamma$, and suppose that no pole of $f$ lies on $\gamma$. Then
\[
\int_\gamma f(z)\,dz = 2\pi i \sum_{i=1}^n \mathrm{res}_{z_i}(f).
\]
Proof.
See [2, Thm. 10.5] (we can restrict $U$ to a simply connected set).

To see where the $2\pi i$ comes from, consider $\int_\gamma \frac{dz}{z}$ with $\gamma(t) = e^{it}$ for $t \in [0, 2\pi]$. In general one weights residues by a corresponding winding number, but the winding number of a simple closed curve about a point in its interior is always 1.

Theorem 16.26 (Cauchy's Integral Formula). Let $U$ be an open set containing a simple closed curve $\gamma$. For any function $f$ holomorphic on $U$ and $a$ in the interior of $\gamma$,
\[
f(a) = \frac{1}{2\pi i}\int_\gamma \frac{f(z)}{z - a}\,dz.
\]
Proof. Apply Cauchy's residue formula to $g(z) = f(z)/(z - a)$; the only pole of $g$ in the interior of $\gamma$ is a simple pole at $z = a$ with $\mathrm{res}_a(g) = f(a)$.

Cauchy's residue formula can also be used to recover the coefficients $f^{(n)}(a)/n!$ appearing in the Laurent series expansion of a meromorphic function at $a$ (apply it to $f(z)/(z - a)^{n+1}$). One of many useful consequences of this is Liouville's theorem, which can be proved by showing that the Laurent series expansion of a bounded holomorphic function on $\mathbb{C}$ about any point has only one nonzero coefficient (the constant coefficient).

Theorem 16.27 (Liouville's theorem). Bounded entire functions are constant.

Proof. See [2, Thm. 5.10].

We also have the following converse of Cauchy's theorem.

Theorem 16.28 (Morera's Theorem). Let $f$ be a continuous function on an open set $U$, and suppose that for every simple closed curve $\gamma$ contained in $U$ we have
\[
\int_\gamma f(z)\,dz = 0.
\]
Then $f$ is holomorphic on $U$.

Proof. See [3, Thm. II.3.5].

References

[1] Lars V. Ahlfors, Complex analysis: an introduction to the theory of analytic functions of one complex variable, 3rd edition, McGraw-Hill, 1979.
[2] Joseph Bak and Donald J. Newman, Complex analysis, Springer, 2010.
[3] Rolf Busam and Eberhard Freitag, Complex analysis, 2nd edition, Springer, 2009.
[4] Paul Erdös, On a new method in elementary number theory which leads to an elementary proof of the prime number theorem, Proc. Nat. Acad. Scis. U.S.A.
35 (1949), 373–384.
[5] Jacques Hadamard, Sur la distribution des zéros de la fonction ζ(s) et ses conséquences arithmétiques, Bull. Soc. Math. France 24 (1896), 199–220.
[6] Nikolai M. Korobov, Estimates for trigonometric sums and their applications, Uspechi Mat. Nauk 13 (1958), 185–192.
[7] Serge Lang, Complex analysis, 4th edition, Springer, 1985.
[8] David J. Newman, Simple analytic proof of the Prime Number Theorem, Amer. Math. Monthly 87 (1980), 693–696.
[9] Charles Jean de la Vallée Poussin, Recherches analytiques sur la théorie des nombres premiers, Ann. Soc. Sci. Bruxelles 20 (1896), 183–256.
[10] Bernhard Riemann, Über die Anzahl der Primzahlen unter einer gegebenen Grösse, Monatsberichte der Berliner Akademie, 1859.
[11] Atle Selberg, An elementary proof of the Prime-Number Theorem, Ann. Math. 50 (1949), 305–313.
[12] Elias M. Stein and Rami Shakarchi, Complex analysis, Princeton University Press, 2003.
[13] Alfred Tauber, Ein Satz aus der Theorie der unendlichen Reihen, Monatsh f. Mathematik und Physik 8 (1897), 273–277.
[14] Ivan M. Vinogradov, A new estimate of the function ζ(1 + it), Izv. Akad. Nauk SSSR. Ser. Mat. 22 (1958), 161–164.
[15] Don Zagier, Newman's short proof of the Prime Number Theorem, Amer. Math. Monthly 104 (1997), 705–708.

18.785 Number theory I, Fall 2021
Lecture #16, 11/3/2021

16 Riemann's zeta function and the prime number theorem

We now divert our attention from algebraic number theory to talk about zeta functions and L-functions. As we shall see, every global field has a zeta function that is intimately related to the distribution of its primes. We begin with the zeta function of the rational field $\mathbb{Q}$, which we will use to prove the prime number theorem. We will need some basic results from complex analysis, all of which can be found in any introductory textbook (such as [1, 2, 3, 7, 12]).
A short glossary of terms and a list of the basic theorems we will use can be found at the end of these notes (see footnote 1).

Footnote 1. Those familiar with this material should still glance at §16.3.2, which touches on some convergence issues that are particularly relevant to number theoretic applications.

16.1 The Riemann zeta function

Definition 16.1. The Riemann zeta function is the complex function defined by the series
\[
\zeta(s) := \sum_{n \ge 1} n^{-s},
\]
for $\mathrm{Re}(s) > 1$, where $n$ varies over positive integers. It is easy to verify that this series converges absolutely and locally uniformly on $\mathrm{Re}(s) > 1$ (use the integral test on an open ball strictly to the right of the line $\mathrm{Re}(s) = 1$). By Theorem 16.17, it defines a holomorphic function on $\mathrm{Re}(s) > 1$, since each term $n^{-s} = e^{-s \log n}$ is holomorphic.

Theorem 16.2 (Euler product). For $\mathrm{Re}(s) > 1$ we have
\[
\zeta(s) = \sum_{n \ge 1} n^{-s} = \prod_p (1 - p^{-s})^{-1},
\]
where the product converges absolutely. In particular, $\zeta(s) \ne 0$ for $\mathrm{Re}(s) > 1$.

The product in the theorem above ranges over primes $p$. This is a standard practice in analytic number theory that we will follow: the symbol $p$ always denotes a prime, and any sum or product over $p$ is understood to be over primes, even if this is not explicitly stated.

Proof. We have
\[
\sum_{n \ge 1} n^{-s} = \sum_{n \ge 1} \prod_p p^{-v_p(n)s} = \prod_p \sum_{e \ge 0} p^{-es} = \prod_p (1 - p^{-s})^{-1}.
\]
To justify the second equality, consider the partial zeta function $\zeta_m(s)$, which restricts the summation in $\zeta(s)$ to the set $S_m$ of $m$-smooth integers (those with no prime factors $p > m$). If $p_1, \ldots, p_k$ are the primes up to $m$, absolute convergence implies
\[
\zeta_m(s) := \sum_{n \in S_m} n^{-s} = \sum_{e_1, \ldots, e_k \ge 0} (p_1^{e_1} \cdots p_k^{e_k})^{-s} = \prod_{1 \le i \le k} \sum_{e_i \ge 0} (p_i^{-s})^{e_i} = \prod_{p \le m} (1 - p^{-s})^{-1}.
\]
For any $\delta > 0$ the sequence of functions $\zeta_m(s)$ converges uniformly on $\mathrm{Re}(s) > 1 + \delta$ to $\zeta(s)$; indeed, every integer that is not $m$-smooth exceeds $m$, so for any $\varepsilon > 0$ and any such $s$ we have
\[
|\zeta_m(s) - \zeta(s)| \le \Big|\sum_{n > m} n^{-s}\Big| \le \sum_{n > m} |n^{-s}| = \sum_{n > m} n^{-\mathrm{Re}(s)} \le \int_m^\infty x^{-1-\delta}\,dx \le \frac{1}{\delta}\, m^{-\delta} < \varepsilon,
\]
for all sufficiently large $m$.
It follows that the sequence $\zeta_m(s)$ converges locally uniformly to $\zeta(s)$ on $\mathrm{Re}(s) > 1$. The sequence of functions $P_m(s) := \prod_{p \le m} (1 - p^{-s})^{-1}$ clearly converges locally uniformly to $\prod_p (1 - p^{-s})^{-1}$ on any region in which the latter function is absolutely convergent (or even just convergent). For any $s$ in $\mathrm{Re}(s) > 1$ we have
\[
\sum_p \big|\log (1 - p^{-s})^{-1}\big| = \sum_p \Big|\sum_{e \ge 1} \frac{1}{e}\, p^{-es}\Big| \le \sum_p \sum_{e \ge 1} |p^{-s}|^e = \sum_p (|p^s| - 1)^{-1} < \infty,
\]
where we have used the identity $\log(1 - z) = -\sum_{n \ge 1} \frac{1}{n} z^n$, valid for $|z| < 1$. It follows that $\prod_p (1 - p^{-s})^{-1}$ is absolutely convergent (and in particular, nonzero) on $\mathrm{Re}(s) > 1$.

Theorem 16.3 (Analytic continuation I). For $\mathrm{Re}(s) > 1$ we have
\[
\zeta(s) = \frac{1}{s-1} + \phi(s),
\]
where $\phi(s)$ is a holomorphic function on $\mathrm{Re}(s) > 0$. Thus $\zeta(s)$ extends to a meromorphic function on $\mathrm{Re}(s) > 0$ that has a simple pole at $s = 1$ with residue 1 and no other poles.

Proof. For $\mathrm{Re}(s) > 1$ we have
\[
\zeta(s) - \frac{1}{s-1} = \sum_{n \ge 1} n^{-s} - \int_1^\infty x^{-s}\,dx = \sum_{n \ge 1} \Big(n^{-s} - \int_n^{n+1} x^{-s}\,dx\Big) = \sum_{n \ge 1} \int_n^{n+1} \big(n^{-s} - x^{-s}\big)\,dx.
\]
For each $n \ge 1$ the function $\phi_n(s) := \int_n^{n+1} (n^{-s} - x^{-s})\,dx$ is holomorphic on $\mathrm{Re}(s) > 0$. For each fixed $s$ in $\mathrm{Re}(s) > 0$ and $x \in [n, n+1]$ we have
\[
|n^{-s} - x^{-s}| = \Big|\int_n^x s\,t^{-s-1}\,dt\Big| \le \int_n^x \frac{|s|}{|t^{s+1}|}\,dt = \int_n^x \frac{|s|}{t^{1 + \mathrm{Re}(s)}}\,dt \le \frac{|s|}{n^{1 + \mathrm{Re}(s)}},
\]
and therefore
\[
|\phi_n(s)| \le \int_n^{n+1} \big|n^{-s} - x^{-s}\big|\,dx \le \frac{|s|}{n^{1 + \mathrm{Re}(s)}}.
\]
For any $s_0$ with $\mathrm{Re}(s_0) > 0$, if we put $\varepsilon := \mathrm{Re}(s_0)/2$ and $U := B_{<\varepsilon}(s_0)$, then for each $n \ge 1$,
\[
\sup_{s \in U} |\phi_n(s)| \le \frac{|s_0| + \varepsilon}{n^{1 + \varepsilon}} =: M_n,
\]
and $\sum_n M_n = (|s_0| + \varepsilon)\,\zeta(1 + \varepsilon)$ converges. The series $\sum_n \phi_n$ thus converges locally normally on $\mathrm{Re}(s) > 0$. By the Weierstrass M-test (Theorem 16.19), $\sum_n \phi_n$ converges to a function $\phi(s) = \zeta(s) - \frac{1}{s-1}$ that is holomorphic on $\mathrm{Re}(s) > 0$.

We now show that $\zeta(s)$ has no zeros on $\mathrm{Re}(s) = 1$; this fact is crucial to the prime number theorem. For this we use the following ingenious lemma, attributed to Mertens (see footnote 2).

Lemma 16.4 (Mertens). For $x, y \in \mathbb{R}$ with $x > 1$ we have $|\zeta(x)^3 \zeta(x + iy)^4 \zeta(x + 2iy)| \ge 1$.

Footnote 2. If this lemma strikes you as pulling a rabbit out of a hat, well, it is.
For a slight variation, see [15, IV], which uses an alternative approach due to Hadamard.

Proof. From the Euler product $\zeta(s) = \prod_p (1 - p^{-s})^{-1}$, we see that for $\mathrm{Re}(s) > 1$ we have
\[
\log |\zeta(s)| = -\sum_p \log |1 - p^{-s}| = -\sum_p \mathrm{Re} \log (1 - p^{-s}) = \sum_p \sum_{n \ge 1} \frac{\mathrm{Re}(p^{-ns})}{n},
\]
since $\log |z| = \mathrm{Re} \log z$ and $\log(1 - z) = -\sum_{n \ge 1} \frac{z^n}{n}$ for $|z| < 1$. Plugging in $s = x + iy$ yields
\[
\log |\zeta(x + iy)| = \sum_p \sum_{n \ge 1} \frac{\cos(ny \log p)}{n p^{nx}},
\]
since $\mathrm{Re}(p^{-ns}) = p^{-nx}\, \mathrm{Re}(e^{-iny \log p}) = p^{-nx} \cos(-ny \log p) = p^{-nx} \cos(ny \log p)$. Thus
\[
\log |\zeta(x)^3 \zeta(x + iy)^4 \zeta(x + 2iy)| = \sum_p \sum_{n \ge 1} \frac{3 + 4\cos(ny \log p) + \cos(2ny \log p)}{n p^{nx}}.
\]
We now note that the trigonometric identity $\cos(2\theta) = 2\cos^2\theta - 1$ implies
\[
3 + 4\cos\theta + \cos(2\theta) = 2(1 + \cos\theta)^2 \ge 0.
\]
Taking $\theta = ny \log p$ yields $\log |\zeta(x)^3 \zeta(x + iy)^4 \zeta(x + 2iy)| \ge 0$, which proves the lemma.

Corollary 16.5. $\zeta(s)$ has no zeros on $\mathrm{Re}(s) \ge 1$.

Proof. We know from Theorem 16.2 that $\zeta(s)$ has no zeros on $\mathrm{Re}(s) > 1$, so suppose $\zeta(1 + iy) = 0$ for some $y \in \mathbb{R}$. Then $y \ne 0$, since $\zeta(s)$ has a pole at $s = 1$, and we know that $\zeta(s)$ does not have a pole at $1 + 2iy \ne 1$, by Theorem 16.3. We therefore must have
\[
\lim_{x \to 1} |\zeta(x)^3 \zeta(x + iy)^4 \zeta(x + 2iy)| = 0, \tag{1}
\]
since $\zeta(s)$ has a simple pole at $s = 1$, a zero at $1 + iy$, and no pole at $1 + 2iy$. But this contradicts Lemma 16.4.

16.2 The Prime Number Theorem

The prime counting function $\pi \colon \mathbb{R} \to \mathbb{Z}_{\ge 0}$ is defined by
\[
\pi(x) := \sum_{p \le x} 1;
\]
it counts the number of primes up to $x$. The prime number theorem (PNT) states that
\[
\pi(x) \sim \frac{x}{\log x}.
\]
The notation $f(x) \sim g(x)$ means $\lim_{x \to \infty} f(x)/g(x) = 1$; one says that $f$ is asymptotic to $g$. This conjectured growth rate for $\pi(x)$ dates back to Gauss and Legendre in the late 18th century. In fact Gauss believed the asymptotically equivalent but more accurate statement (see footnote 3)
\[
\pi(x) \sim \mathrm{Li}(x) := \int_2^x \frac{dt}{\log t}.
\]
Footnote 3. More accurate in the sense that $|\pi(x) - \mathrm{Li}(x)|$ grows more slowly than $|\pi(x) - \frac{x}{\log x}|$ as $x \to \infty$.
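As a numerical illustration of the two approximations to $\pi(x)$ stated above, the following minimal sketch compares $\pi(x)$, $x/\log x$ and $\mathrm{Li}(x)$ at $x = 10^5$. The helper names `primes_up_to` and `li_from_2`, the choice of $x$, and the midpoint-rule step count are assumptions made for this illustration only; they are not part of the notes.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples i*i, i*i + i, ... of the prime i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def li_from_2(x, steps=200_000):
    """Midpoint-rule approximation of Li(x) = integral of dt / log t over [2, x]."""
    dt = (x - 2.0) / steps
    return dt * sum(1.0 / math.log(2.0 + (k + 0.5) * dt) for k in range(steps))

x = 100_000
pi_x = len(primes_up_to(x))          # exact prime count pi(x)
approx_simple = x / math.log(x)      # the approximation x / log x
approx_li = li_from_2(x)             # the approximation Li(x)

print(pi_x, round(approx_simple, 1), round(approx_li, 1))
```

At this value of $x$ the quantity $\mathrm{Li}(x)$ lies much closer to $\pi(x)$ than $x/\log x$ does, in line with footnote 3; a single data point is of course only suggestive and says nothing about the limit.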
However, it was not until a century later that the prime number theorem was independently proved by Hadamard [5] and de la Vallée Poussin [9] in 1896. Their proofs are both based on the work of Riemann [10], who in 1860 showed that there is a precise connection between the zeros of $\zeta(s)$ and the distribution of primes (we shall say more about this later), but was unable to prove the prime number theorem.

The proof we will give is more recent and due to Newman [8], but it relies on the same properties of the Riemann zeta function that were exploited by both Hadamard and de la Vallée Poussin, the most essential of which is the fact that $\zeta(s)$ has no zeros on $\mathrm{Re}(s) \ge 1$ (Corollary 16.5). A concise version of Newman's proof by Zagier can be found in [15]; we will follow Zagier's outline but be slightly more expansive in our presentation. We should note that there are also "elementary" proofs of the prime number theorem independently obtained by Erdös [4] and Selberg [11] in the 1940s that do not use the Riemann zeta function, but they are elementary only in the sense that they do not use complex analysis; the details of these proofs are considerably more complicated than the one we will give.

Rather than work directly with $\pi(x)$, it is more convenient to work with the log-weighted prime-counting function defined by Chebyshev (see footnote 4)
\[
\vartheta(x) := \sum_{p \le x} \log p,
\]
whose growth rate differs from that of $\pi(x)$ by a logarithmic factor.

Theorem 16.6 (Chebyshev). $\pi(x) \sim \frac{x}{\log x}$ if and only if $\vartheta(x) \sim x$.

Proof. We clearly have $0 \le \vartheta(x) \le \pi(x) \log x$, thus
\[
\frac{\vartheta(x)}{x} \le \frac{\pi(x) \log x}{x}.
\]
For every $\varepsilon \in (0, 1)$ we have
\[
\vartheta(x) \ge \sum_{x^{1-\varepsilon} < p \le x} \log p \ge (1 - \varepsilon)(\log x)\big(\pi(x) - \pi(x^{1-\varepsilon})\big)
\]
1 and all $\varepsilon > 0$ we have $|F(\lambda x) - F(x)| < \varepsilon$ for all sufficiently large $x$.
Fix $\lambda > 1$ and suppose there is an unbounded sequence $(x_n)$ such that $f(x_n) \ge \lambda x_n$ for all $n \ge 1$. For each $x_n$ we have
\[
F(\lambda x_n) - F(x_n) = \int_{x_n}^{\lambda x_n} \frac{f(t) - t}{t^2}\,dt \ge \int_{x_n}^{\lambda x_n} \frac{\lambda x_n - t}{t^2}\,dt = \int_1^\lambda \frac{\lambda - t}{t^2}\,dt = c,
\]
for some $c > 0$, where we used the fact that $f$ is non-decreasing to get the middle inequality.
Taking $\varepsilon < c$, we have $|F(\lambda x_n) - F(x_n)| \ge c > \varepsilon$ for arbitrarily large $x_n$, a contradiction.
Thus $f(x) < \lambda x$ for all sufficiently large $x$. A similar argument shows that $f(x) > \frac{1}{\lambda} x$ for all sufficiently large $x$. These inequalities hold for all $\lambda > 1$, so $\lim_{x \to \infty} f(x)/x = 1$.
Equivalently, $f(x) \sim x$.
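The constant $c$ in the argument above can be written down explicitly. The following worked computation is a supplementary sketch (it does not appear in the notes themselves); it uses the substitution $t = x_n u$ for the last equality in the display and the elementary inequality $\log \lambda < \lambda - 1$ for $\lambda > 1$.

```latex
\[
\int_{x_n}^{\lambda x_n} \frac{\lambda x_n - t}{t^2}\,dt
  \;\overset{t = x_n u}{=}\; \int_1^{\lambda} \frac{\lambda x_n - x_n u}{x_n^2 u^2}\, x_n\,du
  = \int_1^{\lambda} \frac{\lambda - u}{u^2}\,du,
\]
\[
c = \int_1^{\lambda} \frac{\lambda - t}{t^2}\,dt
  = \lambda\Big(1 - \frac{1}{\lambda}\Big) - \log\lambda
  = \lambda - 1 - \log\lambda > 0 \qquad (\lambda > 1).
\]
```

In particular $c$ depends only on $\lambda$ and not on $x_n$, which is what allows a single choice of $\varepsilon < c$ to give the contradiction.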
Footnote 5. The equality sign in the big-O notation $f(x) = O(g(x))$ is a standard abuse of notation; it simply means $\limsup_{x \to \infty} |f(x)|/|g(x)| < \infty$ (and nothing more). In more complicated equalities a big-O expression should be interpreted as a set of functions, one of which makes the equality true, for example, $\sum_{1 \le k \le n} \frac{1}{k} = \log n + O(1)$.
In order to show that the hypothesis of Lemma 16.8 is satisfied for f = ϑ, we will work with the function $H(t) = \vartheta(e^t)e^{-t} - 1$; the change of variables $t = e^u$ shows that
$$\int_1^\infty \frac{\vartheta(t) - t}{t^2}\,dt \ \text{converges} \iff \int_0^\infty H(u)\,du \ \text{converges}.$$
We now recall the Laplace transform.
Definition 16.9. Let $h : \mathbb{R}_{>0} \to \mathbb{R}$ be a piecewise continuous function. The Laplace transform $\mathcal{L}h$ of h is the complex function defined by
$$\mathcal{L}h(s) := \int_0^\infty e^{-st}h(t)\,dt,$$
which is holomorphic on Re(s) > c for any c ∈ R for which $h(t) = O(e^{ct})$.
The following properties of the Laplace transform are easily verified.
• L(g + h) = Lg + Lh, and for any a ∈ R we have L(ah) = aLh.
• If h(t) = a ∈ R is constant then $\mathcal{L}h(s) = \frac{a}{s}$.
• $\mathcal{L}(e^{at}h(t))(s) = \mathcal{L}(h)(s - a)$ for all a ∈ R.
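The constant and shift properties above can be sanity-checked numerically. The following is only a rough sketch, restricted to real s: the helper `laplace`, its truncation point `T`, and the step count `n` are illustrative choices that do not come from the notes, and the integral is approximated with a plain trapezoid rule.

```python
import math

def laplace(h, s, T=40.0, n=200_000):
    """Trapezoid-rule approximation of the integral of exp(-s*t)*h(t) over [0, T].

    Assumes s is real and large enough that the tail beyond T is negligible.
    """
    dt = T / n
    total = 0.5 * (h(0.0) + math.exp(-s * T) * h(T))
    for k in range(1, n):
        t = k * dt
        total += math.exp(-s * t) * h(t)
    return total * dt

s, a = 2.0, 0.5
const = laplace(lambda t: 3.0, s)                               # constant h(t) = 3
shifted = laplace(lambda t: math.exp(a * t) * math.cos(t), s)   # L(e^{at} cos t)(s)
unshifted = laplace(math.cos, s - a)                            # L(cos)(s - a)
print(const, shifted, unshifted)
```

Here `const` should be close to 3/s, and `shifted` and `unshifted` should agree up to discretization error, in line with the second and third bullets.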
We now define the auxiliary function
$$\Phi(s) := \sum_p p^{-s}\log p,$$
which is related to ϑ(x) by the following lemma.
Lemma 16.10. $\mathcal{L}(\vartheta(e^t))(s) = \frac{\Phi(s)}{s}$ is holomorphic on Re(s) > 1.
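Proof. By Lemma 16.7, $\vartheta(e^t) = O(e^t)$, so $\mathcal{L}(\vartheta(e^t))$ is holomorphic on Re(s) > 1. Let $p_n$ be the nth prime, and put $p_0 := 0$. The function $\vartheta(e^t)$ is constant on $t \in (\log p_n, \log p_{n+1})$, so
$$\int_{\log p_n}^{\log p_{n+1}} e^{-st}\vartheta(e^t)\,dt = \vartheta(p_n)\int_{\log p_n}^{\log p_{n+1}} e^{-st}\,dt = \frac{1}{s}\,\vartheta(p_n)\bigl(p_n^{-s} - p_{n+1}^{-s}\bigr).$$
We then have
$$\begin{aligned}
(\mathcal{L}\vartheta(e^t))(s) = \int_0^\infty e^{-st}\vartheta(e^t)\,dt
&= \frac{1}{s}\sum_{n=1}^\infty \vartheta(p_n)\bigl(p_n^{-s} - p_{n+1}^{-s}\bigr) \\
&= \frac{1}{s}\sum_{n=1}^\infty \vartheta(p_n)p_n^{-s} - \frac{1}{s}\sum_{n=1}^\infty \vartheta(p_{n-1})p_n^{-s} \\
&= \frac{1}{s}\sum_{n=1}^\infty \bigl(\vartheta(p_n) - \vartheta(p_{n-1})\bigr)p_n^{-s} \\
&= \frac{1}{s}\sum_{n=1}^\infty p_n^{-s}\log p_n = \frac{\Phi(s)}{s}.
\end{aligned}$$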
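The rearrangement in this proof is a summation by parts, and its finite version can be checked numerically. In the sketch below (the helper names `primes_up_to` and `check_partial_summation` are made up for illustration), the sum is truncated at N primes, which leaves a boundary term $\vartheta(p_N)p_{N+1}^{-s}$ that tends to 0 as N grows when Re(s) > 1.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def check_partial_summation(s, limit=10_000):
    """Return both sides of the truncated identity
    sum_{n<=N} theta(p_n) (p_n^-s - p_{n+1}^-s)
      = sum_{n<=N} p_n^-s log p_n - theta(p_N) p_{N+1}^-s.
    """
    p = primes_up_to(limit)
    N = len(p) - 1               # p[0..N-1] play the role of p_1..p_N, p[N] is p_{N+1}
    theta = 0.0
    lhs = 0.0
    rhs = 0.0
    for n in range(N):
        theta += math.log(p[n])  # theta(p_n)
        lhs += theta * (p[n] ** -s - p[n + 1] ** -s)
        rhs += p[n] ** -s * math.log(p[n])
    rhs -= theta * p[N] ** -s    # boundary term from the truncation
    return lhs, rhs

lhs, rhs = check_partial_summation(2.0)
print(lhs, rhs)
```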
Let us now consider the function $H(t) := \vartheta(e^t)e^{-t} - 1$. It follows from the lemma and standard properties of the Laplace transform that on Re(s) > 0 we have
$$\mathcal{L}H(s) = \mathcal{L}(\vartheta(e^t)e^{-t})(s) - (\mathcal{L}1)(s) = \mathcal{L}(\vartheta(e^t))(s+1) - \frac{1}{s} = \frac{\Phi(s+1)}{s+1} - \frac{1}{s}.$$
Lemma 16.11. The function $\Phi(s) - \frac{1}{s-1}$ extends to a meromorphic function on Re(s) > 1/2 that is holomorphic on Re(s) ≥ 1.
Proof. By Theorem 16.3, ζ(s) extends to a meromorphic function on Re(s) > 0, which we also denote ζ(s), that has only a simple pole at s = 1 and no zeros on Re(s) ≥ 1, by Corollary 16.5. It follows that the logarithmic derivative ζ′(s)/ζ(s) of ζ(s) is meromorphic on Re(s) > 0, and that its only pole on Re(s) ≥ 1 is a simple pole at s = 1 with residue −1 (see §16.3.1 for standard facts about the logarithmic derivative of a meromorphic function). In terms of the Euler product, for Re(s) > 1 we have⁶
$$\begin{aligned}
-\frac{\zeta'(s)}{\zeta(s)} &= \bigl(-\log\zeta(s)\bigr)' = \Bigl(-\log\prod_p (1 - p^{-s})^{-1}\Bigr)' = \Bigl(\sum_p \log(1 - p^{-s})\Bigr)' \\
&= \sum_p \frac{p^{-s}\log p}{1 - p^{-s}} = \sum_p \frac{\log p}{p^s - 1} = \sum_p \Bigl(\frac{1}{p^s} + \frac{1}{p^s(p^s - 1)}\Bigr)\log p \\
&= \Phi(s) + \sum_p \frac{\log p}{p^s(p^s - 1)}.
\end{aligned}$$
The sum on the RHS converges absolutely and locally uniformly to a holomorphic function on Re(s) > 1/2. The LHS is meromorphic on Re(s) > 0, and on Re(s) ≥ 1 it has only a simple pole at s = 1 with residue 1. It follows that $\Phi(s) - \frac{1}{s-1}$ extends to a meromorphic function on Re(s) > 1/2 that is holomorphic on Re(s) ≥ 1.
Corollary 16.12. The functions $\Phi(s+1) - \frac{1}{s}$ and $(\mathcal{L}H)(s) = \frac{\Phi(s+1)}{s+1} - \frac{1}{s}$ both extend to meromorphic functions on Re(s) > −1/2 that are holomorphic on Re(s) ≥ 0.
Proof. The first statement follows immediately from the lemma. For the second, note that
$$\frac{\Phi(s+1)}{s+1} - \frac{1}{s} = \frac{1}{s+1}\Bigl(\Phi(s+1) - \frac{1}{s}\Bigr) - \frac{1}{s+1}$$
is meromorphic on Re(s) > −1/2 and holomorphic on Re(s) ≥ 0, since it is a sum of products of such functions.
The final step of the proof relies on the following analytic result due to Newman [8].
Theorem 16.13. Let $f : \mathbb{R}_{\ge 0} \to \mathbb{R}$ be a bounded piecewise continuous function, and suppose its Laplace transform extends to a holomorphic function g(s) on Re(s) ≥ 0. Then the integral $\int_0^\infty f(t)\,dt$ converges and is equal to g(0).
Proof. Without loss of generality we assume |f(t)| ≤ 1 for all t ≥ 0. For $\tau \in \mathbb{R}_{>0}$, define $g_\tau(s) := \int_0^\tau f(t)e^{-st}\,dt$. By definition $\int_0^\infty f(t)\,dt = \lim_{\tau\to\infty} g_\tau(0)$, thus it suffices to prove
$$\lim_{\tau\to\infty} g_\tau(0) = g(0).$$
⁶ As is standard when computing logarithmic derivatives, we are taking the principal branch of the complex logarithm and can safely ignore the negative real axis where it is not defined since we are assuming Re(s) > 1.
For r > 0, let $\gamma_r$ be the boundary of the region $\{s : |s| \le r \text{ and } \mathrm{Re}(s) \ge -\delta_r\}$ with $\delta_r > 0$ chosen so that g is holomorphic on $\gamma_r$; such a $\delta_r$ exists because g is holomorphic on Re(s) ≥ 0, hence on some open ball $B_{\le 2\delta(y)}(iy)$ for each y ∈ [−r, r], and we may take $\delta_r := \inf\{\delta(y) : y \in [-r, r]\}$, which is positive because [−r, r] is compact. Each $\gamma_r$ is a simple closed curve, and for each τ > 0 the function $h(s) := (g(s) - g_\tau(s))e^{s\tau}\bigl(1 + \frac{s^2}{r^2}\bigr)$ is holomorphic on a region containing $\gamma_r$. Using Cauchy’s integral formula (Theorem 16.26) to evaluate h(0) yields
$$g(0) - g_\tau(0) = h(0) = \frac{1}{2\pi i}\int_{\gamma_r} \bigl(g(s) - g_\tau(s)\bigr)e^{s\tau}\Bigl(\frac{1}{s} + \frac{s}{r^2}\Bigr)ds. \tag{2}$$
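We will show the LHS tends to 0 as τ → ∞ by showing that for any ε > 0 we can set r = 3/ε > 0 so that the absolute value of the RHS is less than ε for all sufficiently large τ.
Let $\gamma_r^+$ denote the part of $\gamma_r$ in Re(s) > 0, a semicircle of radius r. The integrand is absolutely bounded by $2/r^2$ on $\gamma_r^+$, since for |s| = r and Re(s) > 0 we have
$$\begin{aligned}
\Bigl|\bigl(g(s) - g_\tau(s)\bigr)\cdot e^{s\tau}\Bigl(\frac{1}{s} + \frac{s}{r^2}\Bigr)\Bigr|
&= \Bigl|\int_\tau^\infty f(t)e^{-st}\,dt\Bigr| \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r} \cdot \Bigl|\frac{r}{s} + \frac{s}{r}\Bigr| \\
&\le \int_\tau^\infty e^{-\mathrm{Re}(s)t}\,dt \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r} \cdot \frac{2\,\mathrm{Re}(s)}{r} \\
&= \frac{e^{-\mathrm{Re}(s)\tau}}{\mathrm{Re}(s)} \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r} \cdot \frac{2\,\mathrm{Re}(s)}{r} \\
&= 2/r^2.
\end{aligned}$$
Therefore
$$\Bigl|\frac{1}{2\pi i}\int_{\gamma_r^+} \bigl(g(s) - g_\tau(s)\bigr)e^{s\tau}\Bigl(\frac{1}{s} + \frac{s}{r^2}\Bigr)ds\Bigr| \le \frac{1}{2\pi}\cdot \pi r \cdot \frac{2}{r^2} = \frac{1}{r}. \tag{3}$$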
Now let $\gamma_r^-$ be the part of $\gamma_r$ in Re(s) < 0, a truncated semicircle. For any fixed r, the first term $g(s)e^{s\tau}(s^{-1} + sr^{-2})$ in the integrand of (2) tends to 0 as τ → ∞ for Re(s) < 0 and |s| ≤ r. For the second term we note that since $g_\tau(s)$ is holomorphic on C, it makes no difference if we instead integrate over the semicircle of radius r in Re(s) < 0. For |s| = r and Re(s) < 0 we then have
$$\begin{aligned}
\Bigl|g_\tau(s)e^{s\tau}\Bigl(\frac{1}{s} + \frac{s}{r^2}\Bigr)\Bigr|
&= \Bigl|\int_0^\tau f(t)e^{-st}\,dt\Bigr| \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r} \cdot \Bigl|\frac{r}{s} + \frac{s}{r}\Bigr| \\
&\le \int_0^\tau e^{-\mathrm{Re}(s)t}\,dt \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r} \cdot \frac{(-2\,\mathrm{Re}(s))}{r} \\
&= \frac{e^{-\mathrm{Re}(s)\tau} - 1}{-\mathrm{Re}(s)} \cdot \frac{e^{\mathrm{Re}(s)\tau}}{r} \cdot \frac{(-2\,\mathrm{Re}(s))}{r} \\
&= 2/r^2 \cdot \bigl(1 - e^{\mathrm{Re}(s)\tau}\bigr),
\end{aligned}$$
where the factor $(1 - e^{\mathrm{Re}(s)\tau})$ on the RHS tends to 1 as τ → ∞ since Re(s) < 0. We thus obtain the bound 1/r + o(1) when we replace $\gamma_r^+$ with $\gamma_r^-$ in (3), and the RHS of (2) is bounded by 2/r + o(1) as τ → ∞. It follows that for any ε > 0, for r = 3/ε > 0 we have
$$|g(0) - g_\tau(0)| < 3/r = \epsilon$$
for all sufficiently large τ. Therefore $\lim_{\tau\to\infty} g_\tau(0) = g(0)$ as desired.
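To see why the hypothesis on the Laplace transform matters, it may help to compare two bounded functions that are not taken from the notes. For f(t) = e^{-t} the transform is 1/(s+1), which is holomorphic on Re(s) ≥ 0 with value 1 at s = 0. For f(t) = cos t the transform is s/(s²+1), which has poles at s = ±i on the boundary line. The sketch below uses the closed forms of the truncated integrals $g_\tau(0)$; the function names are illustrative.

```python
import math

def g_tau_exp(tau):
    """g_tau(0) for f(t) = exp(-t): the integral of exp(-t) over [0, tau]."""
    return 1.0 - math.exp(-tau)

def g_tau_cos(tau):
    """g_tau(0) for f(t) = cos(t): the integral of cos(t) over [0, tau]."""
    return math.sin(tau)

taus = (10.0, 20.0, 40.0)
exp_values = [g_tau_exp(tau) for tau in taus]   # approaches g(0) = 1
cos_values = [g_tau_cos(tau) for tau in taus]   # keeps oscillating in [-1, 1]
print(exp_values)
print(cos_values)
```

The first list settles at g(0) = 1, as the theorem predicts. The second does not converge, which is consistent with the theorem because its hypothesis fails for cos t.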
Remark 16.14. Theorem 16.13 is an example of what is known as a Tauberian theorem. For a piecewise continuous function $f : \mathbb{R}_{\ge 0} \to \mathbb{R}$, its Laplace transform
$$\mathcal{L}f(s) := \int_0^\infty e^{-st}f(t)\,dt$$
is typically not defined on Re(s) ≤ c, where c is the least c for which $f(t) = O(e^{ct})$. Now it may happen that the function $\mathcal{L}f$ has an analytic continuation to a larger domain; for example, if $f(t) = e^t$ then $(\mathcal{L}f)(s) = \frac{1}{s-1}$ extends to a holomorphic function on C − {1}. But plugging values of s with Re(s) ≤ c into the integral usually does not work; in our $f(t) = e^t$ example, the integral diverges on Re(s) ≤ 1. The theorem says that when $\mathcal{L}f$ extends to a holomorphic function on the entire half-plane Re(s) ≥ 0, its value at s = 0 is exactly what we would get by simply plugging 0 into the integral defining $\mathcal{L}f$.
More generally, Tauberian theorems refer to results related to transforms f → T(f) that allow us to deduce properties of f (such as the convergence of $\int_0^\infty f(t)\,dt$) from properties of T(f) (such as analytic continuation to Re(s) ≥ 0). The term “Tauberian” was coined by Hardy and Littlewood and refers to Alfred Tauber, who proved a theorem of this type as a partial converse to a theorem of Abel.
Theorem 16.15 (Prime Number Theorem). π(x) ∼ x/log x.
Proof. $H(t) = \vartheta(e^t)e^{-t} - 1$ is piecewise continuous and bounded, by Lemma 16.7, and its Laplace transform extends to a holomorphic function on Re(s) ≥ 0, by Corollary 16.12. Theorem 16.13 then implies that the integral
$$\int_0^\infty H(t)\,dt = \int_0^\infty \bigl(\vartheta(e^t)e^{-t} - 1\bigr)dt$$
converges. Replacing t with log x, we see that
$$\int_1^\infty \Bigl(\frac{\vartheta(x)}{x} - 1\Bigr)\frac{dx}{x} = \int_1^\infty \frac{\vartheta(x) - x}{x^2}\,dx$$
converges. Lemma 16.8 implies ϑ(x) ∼ x, equivalently, π(x) ∼ x/log x, by Theorem 16.6.
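The two equivalent asymptotics can be observed numerically, although convergence is slow. The sketch below is illustrative only: `prime_counts` is a made-up helper that computes π(x) and ϑ(x) with a sieve of Eratosthenes, and the printed ratios ϑ(x)/x and π(x) log x / x should both drift toward 1.

```python
import math

def prime_counts(x):
    """Return (pi(x), theta(x)) for an integer x >= 2 using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    pi = sum(sieve)                                               # number of primes <= x
    theta = sum(math.log(p) for p in range(2, x + 1) if sieve[p]) # sum of log p
    return pi, theta

for x in (10**3, 10**4, 10**5, 10**6):
    pi, theta = prime_counts(x)
    print(x, theta / x, pi * math.log(x) / x)
```

The ratio involving ϑ approaches 1 noticeably faster than the one involving π, which is one practical reason the proof works with ϑ.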
One disadvantage of our proof is that it does not give us an error term. Using more sophisticated methods, Korobov [6] and Vinogradov [14] independently obtained the bound
$$\pi(x) = \mathrm{Li}(x) + O\!\left(\frac{x}{\exp\bigl((\log x)^{3/5 + o(1)}\bigr)}\right),$$
in which we note that the error term is bounded by $O(x/(\log x)^n)$ for all n but not by $O(x^{1-\epsilon})$ for any ε > 0. Assuming the Riemann Hypothesis, which states that the zeros of ζ(s) in the critical strip 0 < Re(s) < 1 all lie on the line Re(s) = 1/2, one can prove
$$\pi(x) = \mathrm{Li}(x) + O(x^{1/2 + o(1)}).$$
More generally, if we knew that ζ(s) has no zeros in the critical strip with real part greater than c, for some c ≥ 1/2 strictly less than 1, we could prove $\pi(x) = \mathrm{Li}(x) + O(x^{c + o(1)})$.
There thus remains a large gap between what we can prove about the distribution of prime numbers and what we believe to be true. Remarkably, other than refinements to the o(1) term appearing in the Korobov-Vinogradov bound, essentially no progress has been made on this problem in the last 60 years.
16.3 A quick recap of some basic complex analysis
The complex numbers C are a topological field under the distance metric d(x, y) = |x − y| induced by the standard absolute value $|z| := \sqrt{z\bar{z}}$, which is also a norm on C as an R-vector space; all references to the topology on C (open, compact, convergence, limits, etc.) are made with this understanding.
16.3.1 Glossary of terms and standard theorems
Let f and g denote complex functions defined on an open subset of C.
• f is differentiable at $z_0$ if $\lim_{z\to z_0} \frac{f(z) - f(z_0)}{z - z_0}$ exists.
• f is holomorphic at $z_0$ if it is differentiable on an open neighborhood of $z_0$.
• f is analytic at $z_0$ if there is an open neighborhood of $z_0$ in which f can be defined by a power series $f(z) = \sum_{n \ge 0} a_n(z - z_0)^n$; equivalently, f is infinitely differentiable and has a convergent Taylor series on an open neighborhood of $z_0$.
• Theorem: f is holomorphic at $z_0$ if and only if it is analytic at $z_0$.
• Theorem: If C is a connected set containing a nonempty open set U and f and g are holomorphic on C with $f|_U = g|_U$, then $f|_C = g|_C$.
• With U and C as above, if f is holomorphic on U and g is holomorphic on C with $f|_U = g|_U$, then g is the (unique) analytic continuation of f to C and f extends to g.
• If f is holomorphic on a punctured open neighborhood of $z_0$ and |f(z)| → ∞ as z → $z_0$ then $z_0$ is a pole of f; note that the set of poles of f is necessarily a discrete set.
• f is meromorphic at $z_0$ if it is holomorphic at $z_0$ or has $z_0$ as a pole.
• Theorem: If f is meromorphic at $z_0$ then it can be defined by a Laurent series $f(z) = \sum_{n \ge n_0} a_n(z - z_0)^n$ that converges on an open punctured neighborhood of $z_0$.
• The order of vanishing $\mathrm{ord}_{z_0}(f)$ of a nonzero function f that is meromorphic at $z_0$ is the least index n of the nonzero coefficients $a_n$ in its Laurent series expansion at $z_0$. Thus $z_0$ is a pole of f iff $\mathrm{ord}_{z_0}(f) < 0$ and $z_0$ is a zero of f iff $\mathrm{ord}_{z_0}(f) > 0$.
• If $\mathrm{ord}_{z_0}(f) = 1$ then $z_0$ is a simple zero of f, and if $\mathrm{ord}_{z_0}(f) = -1$ it is a simple pole.
• The residue $\mathrm{res}_{z_0}(f)$ of a function f meromorphic at $z_0$ is the coefficient $a_{-1}$ in its Laurent series expansion $f(z) = \sum_{n \ge n_0} a_n(z - z_0)^n$ at $z_0$.
• Theorem: If $z_0$ is a simple pole of f then $\mathrm{res}_{z_0}(f) = \lim_{z\to z_0}(z - z_0)f(z)$.
• Theorem: If f is meromorphic on a set S then so is its logarithmic derivative f′/f, and f′/f has only simple poles in S and $\mathrm{res}_{z_0}(f'/f) = \mathrm{ord}_{z_0}(f)$ for all $z_0 \in S$. In particular the poles of f′/f are precisely the zeros and poles of f.
16.3.2 Convergence
Recall that a series $\sum_{n=1}^\infty a_n$ of complex numbers converges absolutely if the series $\sum_n |a_n|$ of nonnegative real numbers converges. An equivalent definition is that the function $a(n) := a_n$ is integrable with respect to the counting measure µ on the set of positive integers N. Indeed, if the series is absolutely convergent then
$$\sum_{n=1}^\infty a_n = \int_{\mathbb{N}} a(n)\,\mu,$$
and if the series is not absolutely convergent, the integral is not defined. Absolute convergence is effectively built into the definition of the Lebesgue integral, which requires that in order for the function a(n) = x(n) + iy(n) to be integrable, the positive real functions |x(n)| and |y(n)| must both be integrable (summable), and separately computes sums of the positive and negative subsequences of (x(n)) and (y(n)) as suprema over finite subsets.
The measure-theoretic perspective has some distinct advantages. It makes it immediately clear that we may replace the index set N with any set of the same cardinality, since the counting measure depends only on the cardinality of N, not its ordering. We are thus free to sum over any countable index set, including Z, Q, any finite product of countable sets, and any countable coproduct of countable sets (such as countable direct sums of Z); such sums are ubiquitous in number theory and many cannot be meaningfully interpreted as limits of partial sums in the usual sense, since this assumes that the index set is well ordered (not the case with Q, for example). The measure-theoretic view also makes it clear that we may convert any absolutely convergent sum of the form $\sum_{X \times Y}$ into an iterated sum $\sum_X \sum_Y$ (or vice versa), via Fubini’s theorem.
We say that an infinite product $\prod_n a_n$ of nonzero complex numbers is absolutely convergent when the sum $\sum_n \log a_n$ is, in which case $\prod_n a_n := \exp\bigl(\sum_n \log a_n\bigr)$.⁷ This implies that an absolutely convergent product cannot converge to zero, and the sequence $(a_n)$ must converge to 1 (no matter how we order the $a_n$). All of our remarks above about absolutely convergent series apply to absolutely convergent products as well.
A series or product of complex functions fn (z) is absolutely convergent on S if the series
or product of complex numbers fn (z0 ) is absolutely convergent for all z0 ∈ S.
Definition 16.16. A sequence of complex functions $(f_n)$ converges uniformly on S if there is a function f such that for every ε > 0 there is an integer N for which $\sup_{z \in S} |f_n(z) - f(z)| < \epsilon$ for all n ≥ N. The sequence $(f_n)$ converges locally uniformly on S if every $z_0 \in S$ has an open neighborhood U for which $(f_n)$ converges uniformly on U ∩ S. When applied to a series of functions these terms refer to the sequence of partial sums.
Because C is locally compact, locally uniform convergence is the same thing as compact
convergence: a sequence of functions converges locally uniformly on S if and only if it
converges uniformly on every compact subset of S.
Theorem 16.17. A sequence or series of holomorphic functions $f_n$ that converges locally uniformly on an open set U converges to a holomorphic function f on U, and the sequence or series of derivatives $f_n'$ then converges locally uniformly to f′ (and if none of the $f_n$ has a zero in U and f ≠ 0, then f has no zeros in U).
Proof. See [3, Thm. III.1.3] and [3, Thm. III.7.2].
Definition 16.18. A series of complex functions $\sum_n f_n(z)$ converges normally on a set S if $\sum_n \|f_n\| := \sum_n \sup_{z \in S} |f_n(z)|$ converges. The series $\sum_n f_n(z)$ converges locally normally on S if every $z_0 \in S$ has an open neighborhood U on which $\sum_n f_n(z)$ converges normally.
Theorem 16.19 (Weierstrass M-test). Every locally normally convergent series of functions converges absolutely and locally uniformly. Moreover, a series $\sum_n f_n$ of holomorphic functions on S that converges locally normally converges to a holomorphic function f on S, and then $\sum_n f_n'$ converges locally normally to f′.
Proof. See [3, Thm. III.1.6].
⁷ In this definition we use the principal branch of log z := log |z| + i Arg z with Arg z ∈ (−π, π).
Remark 16.20. To show a series $\sum_n f_n$ is locally normally convergent on a set S amounts to proving that for every $z_0 \in S$ there is an open neighborhood U of $z_0$ and a sequence of real numbers $(M_n)$ such that $|f_n(z)| \le M_n$ for z ∈ U ∩ S and $\sum_n M_n < \infty$, whence the term “M-test”.
16.3.3 Contour integration
We shall restrict our attention to integrals along contours defined by piecewise-smooth parameterized curves; this covers all the cases we shall need.
Definition 16.21. A parameterized curve is a continuous function γ : [a, b] → C whose domain is a compact interval [a, b] ⊆ R. We say that γ is smooth if it has a continuous nonzero derivative on [a, b], and piecewise-smooth if [a, b] can be partitioned into finitely many subintervals on which the restriction of γ is smooth. We say that γ is closed if γ(a) = γ(b), and simple if it is injective on [a, b) and (a, b]. Henceforth we will use the term curve to refer to any piecewise-smooth parameterized curve γ, or to its oriented image in the complex plane (directed from γ(a) to γ(b)), which we may also denote γ.
Definition 16.22. Let f : Ω → C be a continuous function and let γ be a curve in Ω. We define the contour integral
$$\int_\gamma f(z)\,dz := \int_a^b f(\gamma(t))\gamma'(t)\,dt,$$
whenever the integral on the RHS (which is defined as a Riemann sum in the usual way) converges. Whether $\int_\gamma f(z)\,dz$ converges, and if so, to what value, does not depend on the parameterization of γ: if $\tilde\gamma$ is another parameterized curve with the same (oriented) image as γ, then $\int_{\tilde\gamma} f(z)\,dz = \int_\gamma f(z)\,dz$.
We have the following analog of the fundamental theorem of calculus.
Theorem 16.23. Let γ : [a, b] → C be a curve in an open set Ω and let f : Ω → C be a holomorphic function. Then
$$\int_\gamma f'(z)\,dz = f(\gamma(b)) - f(\gamma(a)).$$
Proof. See [2, Prop. 4.12].
Recall that the Jordan curve theorem implies that every simple closed curve γ partitions C into two components, one of which we may unambiguously designate as the interior
(the one on the left as we travel along our oriented curve). We say that γ is contained in an
open set U if both γ and its interior lie in U . The interior of γ is a simply connected set, and
if an open set U contains γ then it contains a simply connected open set that contains γ.
Theorem 16.24 (Cauchy’s Theorem). Let U be an open set containing a simple closed curve γ. For any function f that is holomorphic on U we have
$$\int_\gamma f(z)\,dz = 0.$$
Proof. See [2, Thm. 8.6] (we can restrict U to a simply connected set).
Cauchy’s theorem generalizes to meromorphic functions.
Theorem 16.25 (Cauchy Residue Formula). Let U be an open set containing a simple closed curve γ. Let f be a function that is meromorphic on U, let $z_1, \ldots, z_n$ be the poles of f that lie in the interior of γ, and suppose that no pole of f lies on γ. Then
$$\int_\gamma f(z)\,dz = 2\pi i \sum_{i=1}^n \mathrm{res}_{z_i}(f).$$
Proof. See [2, Thm. 10.5] (we can restrict U to a simply connected set).
To see where the 2πi comes from, consider $\int_\gamma \frac{dz}{z}$ with $\gamma(t) = e^{it}$ for t ∈ [0, 2π]. In general one weights residues by a corresponding winding number, but the winding number of a simple closed curve about a point in its interior is always 1.
Theorem 16.26 (Cauchy’s Integral Formula). Let U be an open set containing a simple closed curve γ. For any function f holomorphic on U and a in the interior of γ,
$$f(a) = \frac{1}{2\pi i}\int_\gamma \frac{f(z)}{z - a}\,dz.$$
Proof. Apply Cauchy’s residue formula to g(z) = f(z)/(z − a); the only poles of g in the interior of γ are a simple pole at z = a with $\mathrm{res}_a(g) = f(a)$.
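Both the value 2πi and Cauchy’s integral formula are easy to check numerically on a circle. The following sketch discretizes the contour integral of Definition 16.22 with the parameterization γ(t) = center + radius·e^{it}; the helper `contour_integral`, the test function e^z, and the point a are illustrative choices, not part of the notes.

```python
import cmath
import math

def contour_integral(f, center=0j, radius=1.0, n=2000):
    """Approximate the integral of f over the counterclockwise circle
    |z - center| = radius using n equally spaced sample points."""
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        w = cmath.exp(1j * k * dt)
        z = center + radius * w
        dz = 1j * radius * w * dt      # gamma'(t) dt
        total += f(z) * dz
    return total

one_over_z = contour_integral(lambda z: 1 / z)   # expected to be close to 2*pi*i

a = 0.3 + 0.2j                                   # a point inside the unit circle
cauchy = contour_integral(lambda z: cmath.exp(z) / (z - a)) / (2j * math.pi)
print(one_over_z, cauchy, cmath.exp(a))
```

For a periodic analytic integrand the equally spaced rule converges very quickly, so `cauchy` should match `cmath.exp(a)` to near machine precision.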
Cauchy’s residue formula can also be used to recover the coefficients $f^{(n)}(a)/n!$ appearing in the Laurent series expansion of a meromorphic function at a (apply it to $f(z)/(z - a)^{n+1}$). One of many useful consequences of this is Liouville’s theorem, which can be proved by showing that the Laurent series expansion of a bounded holomorphic function on C about any point has only one nonzero coefficient (the constant coefficient).
Theorem 16.27 (Liouville’s theorem). Bounded entire functions are constant.
Proof. See [2, Thm. 5.10].
We also have the following converse of Cauchy’s theorem.
Theorem 16.28 (Morera’s Theorem). Let f be a continuous function on an open set U, and suppose that for every simple closed curve γ contained in U we have
$$\int_\gamma f(z)\,dz = 0.$$
Then f is holomorphic on U.
Proof. See [3, Thm. II.3.5].
References
[1] Lars V. Ahlfors, Complex analysis: an introduction to the theory of analytic functions of one complex variable, 3rd edition, McGraw-Hill, 1979.
[2] Joseph Bak and Donald J. Newman, Complex analysis, Springer, 2010.
[3] Rolf Busam and Eberhard Freitag, Complex analysis, 2nd edition, Springer, 2009.
[4] Paul Erdős, On a new method in elementary number theory which leads to an elementary proof of the prime number theorem, Proc. Nat. Acad. Scis. U.S.A. 35 (1949), 373–384.
[5] Jacques Hadamard, Sur la distribution des zéros de la fonction ζ(s) et ses conséquences arithmétiques, Bull. Soc. Math. France 24 (1896), 199–220.
[6] Nikolai M. Korobov, Estimates for trigonometric sums and their applications, Uspechi Mat. Nauk 13 (1958), 185–192.
[7] Serge Lang, Complex analysis, 4th edition, Springer, 1985.
[8] David J. Newman, Simple analytic proof of the Prime Number Theorem, Amer. Math. Monthly 87 (1980), 693–696.
[9] Charles Jean de la Vallée Poussin, Recherches analytiques sur la théorie des nombres premiers, Ann. Soc. Sci. Bruxelles 20 (1896), 183–256.
[10] Bernhard Riemann, Über die Anzahl der Primzahlen unter einer gegebenen Grösse, Monatsberichte der Berliner Akademie, 1859.
[11] Atle Selberg, An elementary proof of the Prime-Number Theorem, Ann. Math. 50 (1949), 305–313.
[12] Elias M. Stein and Rami Shakarchi, Complex analysis, Princeton University Press, 2003.
[13] Alfred Tauber, Ein Satz aus der Theorie der unendlichen Reihen, Monatsh f. Mathematik und Physik 8 (1897), 273–277.
[14] Ivan M. Vinogradov, A new estimate of the function ζ(1 + it), Izv. Akad. Nauk SSSR. Ser. Mat. 22 (1958), 161–164.
[15] Don Zagier, Newman’s short proof of the Prime Number Theorem, Amer. Math. Monthly 104 (1997), 705–708.
18.785 Number theory I, Fall 2021
Lecture #18, 11/10/2021
18 Dirichlet L-functions, primes in arithmetic progressions
Having proved the prime number theorem, we would like to prove an analogous result for
primes in arithmetic progressions. We begin with Dirichlet’s theorem on primes in arithmetic
progressions, a result that predates the prime number theorem by sixty years.
Theorem 18.1 (Dirichlet 1837). For all coprime integers a and m there are infinitely many
primes p ≡ a mod m.
In fact Dirichlet proved more than this. In a sense that we will make precise, he proved
that for every fixed modulus m the primes are equidistributed among the residue classes in
(Z/mZ)× . The equidistribution statement that Dirichlet was able to prove is a bit weaker
than one might like, but it is more than enough to establish Theorem 18.1.
Remark 18.2. Many of the standard tools of complex analysis we take for granted were not
available to Dirichlet in 1837. Riemann was the first to seriously study ζ(s) as a function of
a complex variable, some twenty years after Dirichlet proved Theorem 18.1. We will work
in a more modern setting, but our approach follows the spirit of Dirichlet’s proof.
18.1 Infinitely many primes
To motivate Dirichlet’s method of proof, let us consider the following (admittedly clumsy) proof that there are infinitely many primes. It is sufficient to show that the Euler product
$$\zeta(s) = \prod_p (1 - p^{-s})^{-1}$$
diverges as s → 1⁺. Of course we know ζ(s) has a pole at s = 1 (by Theorem 16.3), but let us suppose for the moment that we did not already know this. Taking logarithms yields
$$\log\zeta(s) = -\sum_p \log(1 - p^{-s}) = \sum_p p^{-s} + O(1), \tag{1}$$
as s → 1⁺, where we have used the asymptotic bounds
$$-\log(1 - x) = x + O(x^2)\ \ (\text{as } x \to 0) \qquad\text{and}\qquad \sum_p O(p^{-2s}) = O(1)\ \ (\mathrm{Re}(s) > 1/2 + \epsilon).$$
We can estimate $\sum_{p \le x} \frac{1}{p}$ via Mertens’ second theorem, one of three he proved in [4].
Theorem 18.3 (Mertens 1874). As x → ∞ we have
(1) $\sum_{p \le x} \frac{\log p}{p} = \log x + R(x)$, where |R(x)| < 2;¹
(2) $\sum_{p \le x} \frac{1}{p} = \log\log x + B + O\bigl(\frac{1}{\log x}\bigr)$, where B = 0.261497 . . . is Mertens’ constant;
(3) $\sum_{p \le x} \log\bigl(1 - \frac{1}{p}\bigr) = -\log\log x - \gamma + O\bigl(\frac{1}{\log x}\bigr)$, where γ = 0.577216 . . . is Euler’s constant.
Proof. See Problem Set 9.
¹ In fact, $R(x) = -B_3 + o(1)$ where $B_3$ = 1.332582 . . . is an explicit constant.
Thus not only does $\sum_p p^{-s}$ diverge as s → 1⁺, we can say with a fair degree of precision how quickly this happens. We should note, however, that Mertens’ estimate is not as strong as the prime number theorem. Indeed, as you will prove on Problem Set 9, the Prime Number Theorem is equivalent to the statement
$$\sum_{p \le x} \frac{1}{p} = \log\log x + B + o\Bigl(\frac{1}{\log x}\Bigr),$$
which is (ever so slightly) sharper than Mertens’ estimate.²
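Mertens’ first two estimates can be observed numerically. The sketch below is an illustration, not part of the problem set: `mertens_remainders` is a hypothetical helper that sieves the primes up to x and returns R(x) from part (1) together with the difference between the sum of 1/p and log log x from part (2), which should be close to B ≈ 0.261497.

```python
import math

def mertens_remainders(x):
    """Return (R(x), D(x)) where
    R(x) = sum_{p<=x} log(p)/p - log(x) and
    D(x) = sum_{p<=x} 1/p - log(log(x))."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    sum_log = 0.0
    sum_inv = 0.0
    for p in range(2, x + 1):
        if sieve[p]:
            sum_log += math.log(p) / p
            sum_inv += 1.0 / p
    return sum_log - math.log(x), sum_inv - math.log(math.log(x))

R, D = mertens_remainders(10**6)
print(R, D)
```

At this range D already agrees with Mertens’ constant to a few decimal places, while R stays well inside the bound |R(x)| < 2.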
18.2 Dirichlet characters
We now define the notion of a Dirichlet character. Historically, these preceded the notion of a group character; they were introduced by Dirichlet in 1831, well before the notion of an abstract group was in common use.³ In order to simplify the exposition we will occasionally invoke some standard facts about characters of finite abelian groups that we recall in §18.6.
Definition 18.4. A function f : Z → C is called an arithmetic function.⁴ The function f is multiplicative if f(1) = 1 and f(mn) = f(m)f(n) for all coprime m, n ∈ Z; it is totally multiplicative (or completely multiplicative) if f(1) = 1 and f(mn) = f(m)f(n) for all m, n ∈ Z. For $m \in \mathbb{Z}_{>0}$ we say that f is m-periodic if f(n + m) = f(n) for all n ∈ Z, and we call m the period of f if it is the least m > 0 for which this holds.
Definition 18.5. A Dirichlet character is a periodic totally multiplicative arithmetic function χ : Z → C.
The image of a Dirichlet character is a finite multiplicatively closed subset of C, hence the union of a finite subgroup of U(1) and a subset of {0}. The constant function 1(n) := 1 is the trivial Dirichlet character; it is the unique Dirichlet character of period 1. Each m-periodic Dirichlet character χ restricts to a group character χ on (Z/mZ)×. Conversely, every group character χ of (Z/mZ)× can be extended to a Dirichlet character χ by defining χ(n) = 0 for $n \not\perp m$; this is called extension by zero.
Definition 18.6. A Dirichlet character of modulus m is an m-periodic Dirichlet character χ that is the extension by zero of a group character on (Z/mZ)×; equivalently, an m-periodic Dirichlet character for which χ(n) ≠ 0 ⟺ m ⊥ n.
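A concrete example may help here. The quadratic (Legendre) character modulo an odd prime p is a standard Dirichlet character of modulus p; it is not discussed at this point in the notes and is used below purely as an illustration. The sketch computes it with Euler’s criterion, and the helper name `legendre_character` is an assumption of this example.

```python
from math import gcd

def legendre_character(p):
    """Return chi: Z -> {-1, 0, 1}, the quadratic character modulo an odd prime p."""
    def chi(n):
        r = pow(n % p, (p - 1) // 2, p)   # Euler's criterion: r is 0, 1, or p - 1
        return -1 if r == p - 1 else r
    return chi

chi = legendre_character(7)

# Defining properties from Definitions 18.5 and 18.6, checked on a small range.
periodic = all(chi(n + 7) == chi(n) for n in range(-20, 20))
totally_multiplicative = all(
    chi(a * b) == chi(a) * chi(b) for a in range(-10, 30) for b in range(-10, 30)
)
vanishes_off_units = all((chi(n) != 0) == (gcd(n, 7) == 1) for n in range(-20, 20))
print([chi(n) for n in range(7)], periodic, totally_multiplicative, vanishes_off_units)
```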
Remark 18.7. Some authors only define Dirichlet characters of modulus m, thinking of
them as extensions by zero of group characters on (Z/mZ)× , in which case every χ has an
attached modulus m. But note that the function Z → C given by the extension by zero does
not uniquely determine m (see Lemma 18.8 below). Indeed, the unique Dirichlet character
of modulus 2 is a Dirichlet character of modulus $2^k$ for all k ≥ 1.
The Dirichlet characters of modulus m form a group under pointwise multiplication that is canonically isomorphic to the character group of (Z/mZ)×. Not every m-periodic Dirichlet character χ is a Dirichlet character of modulus m, since an m-periodic Dirichlet character need not vanish on $n \not\perp m$, but if χ has period m then this holds. More generally, we have the following lemma.
² The error term in the PNT actually implies $\sum_{p \le x} \frac{1}{p} = \log\log x + B + O\bigl(\frac{1}{x}\bigr)$, but an $o\bigl(\frac{1}{\log x}\bigr)$ bound is already enough to show π(x) ∼ x/log x. That the difference between a little-o and a big-O is the difference between proving the PNT and not proving it demonstrates how critical it is to understand error terms.
³ Galois’ seminal paper was rejected that same year; it wasn’t published until 12 years after his death.
⁴ Many authors restrict the domain of an arithmetic function to $\mathbb{Z}_{\ge 1}$; for the periodic arithmetic functions we are interested in here, this distinction is irrelevant, and it is slightly more natural to work with Z.
Lemma 18.8. Let χ be a Dirichlet character of period m. Then χ is a Dirichlet character of modulus m′ if and only if $m \mid m' \mid m^k$ for some k (which holds in particular for m′ = m).
Proof. To prove χ is a Dirichlet character of modulus m we must show χ(n) ≠ 0 ⟺ m ⊥ n. Suppose χ(n) ≠ 0 with $m \not\perp n$, and let p be a common prime divisor of m and n. Then χ(p) ≠ 0, since χ(p)χ(n/p) = χ(n) ≠ 0, and for any r ∈ Z we have
χ(r)χ(p) = χ(rp) = χ(rp + m) = χ(r + m/p)χ(p),
which implies χ(r) = χ(r + m/p), since χ(p) ≠ 0. Thus χ is (m/p)-periodic, but this contradicts the minimality of the period m. Therefore χ(n) ≠ 0 ⇒ m ⊥ n.
For any n ⊥ m we can pick e > 0 with $a = n^e \equiv 1 \bmod m$, so that $\chi(n)^e = \chi(n^e) = \chi(a) = \chi(1) \ne 0$, in which case χ(n) ≠ 0. Thus n ⊥ m ⇒ χ(n) ≠ 0, so χ is a Dirichlet character of modulus m.
If $m \mid m' \mid m^k$, then the prime divisors of m′ coincide with those of m. It follows that
n ⊥ m′ ⟺ n ⊥ m ⟺ χ(n) ≠ 0,
and χ is clearly m′-periodic (since m | m′), so χ is a Dirichlet character of modulus m′.
Conversely, if χ is a Dirichlet character of modulus m′, then χ is m′-periodic, and therefore m | m′, since m is the period of χ. And since χ is a Dirichlet character of modulus m and of modulus m′, for each prime p we have
p | m ⟺ χ(p) = 0 ⟺ p | m′,
thus the prime divisors of m and m′ coincide and m′ must divide some power $m^k$ of m.
18.2.1 Primitive Dirichlet characters
Given a Dirichlet character $\chi_1$ of modulus $m_1$ dividing $m_2$, we can always create a Dirichlet character $\chi_2$ of modulus $m_2$ by taking the extension by zero of the restriction of $\chi_1$ to $(\mathbb{Z}/m_2\mathbb{Z})^\times$; in other words, let $\chi_2(n) := \chi_1(n)$ for $n \in (\mathbb{Z}/m_2\mathbb{Z})^\times$ and $\chi_2(n) := 0$ otherwise. If $m_2$ is divisible by a prime p that does not divide $m_1$, the Dirichlet characters $\chi_1$ and $\chi_2$ will not be the same ($\chi_2(p) = 0 \ne \chi_1(p)$, for example); they will agree on $n \in (\mathbb{Z}/m_2\mathbb{Z})^\times$ but not on $n \in (\mathbb{Z}/m_1\mathbb{Z})^\times$.⁵ We can create infinitely many new Dirichlet characters from $\chi_1$ in this way, but they will differ from $\chi_1$ only in a rather trivial sense. We would like to distinguish the Dirichlet characters that arise in this way from those that do not.
Definition 18.9. Let $\chi_1$ and $\chi_2$ be Dirichlet characters of modulus $m_1$ and $m_2$, respectively, with $m_1 \mid m_2$. If $\chi_2(n) = \chi_1(n)$ for $n \in (\mathbb{Z}/m_2\mathbb{Z})^\times$ then $\chi_2$ is induced by $\chi_1$. A Dirichlet character that is not induced by any character other than itself is primitive.
Lemma 18.10. A Dirichlet character $\chi_2$ of modulus $m_2$ is induced by a Dirichlet character of modulus $m_1 \mid m_2$ if and only if $\chi_2$ is constant on residue classes in $(\mathbb{Z}/m_2\mathbb{Z})^\times$ that are congruent modulo $m_1$. When this holds, the Dirichlet character $\chi_1$ of modulus $m_1$ that induces $\chi_2$ is uniquely determined.
⁵ Note that while $\#(\mathbb{Z}/m_1\mathbb{Z})^\times \le \#(\mathbb{Z}/m_2\mathbb{Z})^\times$, the set of integers $n \in (\mathbb{Z}/m_1\mathbb{Z})^\times$ (the n coprime to $m_1$) contains the set of integers $n \in (\mathbb{Z}/m_2\mathbb{Z})^\times$ (the n coprime to $m_2$) and is usually larger.
Proof. If $\chi_2$ is induced by $\chi_1$ then it must be constant on residue classes in $(\mathbb{Z}/m_2\mathbb{Z})^\times$ that are congruent modulo $m_1$, since $\chi_1$ is. To prove the converse we first show that the surjective ring homomorphism $\mathbb{Z}/m_2\mathbb{Z} \to \mathbb{Z}/m_1\mathbb{Z}$ given by reduction modulo $m_1$ induces a surjective homomorphism $\pi : (\mathbb{Z}/m_2\mathbb{Z})^\times \to (\mathbb{Z}/m_1\mathbb{Z})^\times$ of unit groups.⁶
Suppose $u_1 \in \mathbb{Z}$ is a unit modulo $m_1$. Let a be the product of all primes dividing $m_2/m_1$ but not $u_1$. Then $u_2 = u_1 + m_1 a$ is not divisible by any prime $p \mid m_1$ (since $u_1$ isn’t), nor is it divisible by any prime $p \mid (m_2/m_1)$: by construction, such a p divides exactly one of $u_1$ and $m_1 a$. Thus $u_2$ is a unit modulo $m_2$ that reduces to $u_1$ modulo $m_1$ and π is surjective.
If $\chi_2$ is a Dirichlet character of modulus $m_2$ constant on fibers of π we can define a Dirichlet character $\chi_1$ of modulus $m_1$ via $\chi_1(n_1) := \chi_2(n_2)$ for $n_1 \in (\mathbb{Z}/m_1\mathbb{Z})^\times$ with $n_2 \in \pi^{-1}(n_1)$ (any such $n_2$ will do). Thus $\chi_1$ induces $\chi_2$, and if $\chi_1'$ also induces $\chi_2$ it must satisfy the same condition $\chi_1(n_1) = \chi_2(n_2)$ that uniquely determines $\chi_1$.
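The surjectivity argument in this proof is constructive, so it can be turned directly into code. The sketch below follows the recipe $u_2 = u_1 + m_1 a$; the helpers `prime_divisors` and `lift_unit` are illustrative names, trial division is used only for simplicity, and the result is reduced modulo $m_2$ for convenience.

```python
from math import gcd

def prime_divisors(n):
    """Distinct prime divisors of n >= 1, found by trial division."""
    ps, d = [], 2
    while d * d <= n:
        if n % d == 0:
            ps.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        ps.append(n)
    return ps

def lift_unit(u1, m1, m2):
    """Given a unit u1 modulo m1 and m1 | m2, return a unit u2 modulo m2
    with u2 congruent to u1 modulo m1."""
    assert m2 % m1 == 0 and gcd(u1, m1) == 1
    a = 1
    for p in prime_divisors(m2 // m1):
        if u1 % p != 0:          # primes dividing m2/m1 but not u1
            a *= p
    return (u1 + m1 * a) % m2

print(lift_unit(3, 4, 60))       # a unit modulo 60 that reduces to 3 modulo 4
```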
Definition 18.11. A Dirichlet character χ induced by 1 is called principal (and is primitive
if and only if χ = 1). For m ∈ Z>0 we use 1m to denote the principal Dirichlet character of
modulus m; it corresponds to the trivial character of (Z/mZ)× .
Lemma 18.12. Let χ be a Dirichlet character of modulus m. Then
X
χ(n) 6= 0 ⇐⇒ χ = 1m .
n ∈ Z/mZ
Proof. We have χ(n) = 0 for n ∉ (Z/mZ)×, and the sum over (Z/mZ)× is nonzero if and
only if χ restricts to the trivial character on (Z/mZ)×, by the orthogonality of characters;
see Corollary 18.38.
Note that the principal Dirichlet characters 1m and 1m′ necessarily coincide when
m | m′ | m^k; for example, the principal Dirichlet character of modulus 2 (the parity function)
is the same as the principal Dirichlet character of modulus 4 (and every power of 2).
Theorem 18.13. Every Dirichlet character χ is induced by a primitive Dirichlet character χ̃
that is uniquely determined by χ.
Proof. Let us define a partial ordering ⪯ on the set of all Dirichlet characters by defining
χ1 ⪯ χ2 if χ1 induces χ2. The relation ⪯ is clearly reflexive, and it follows from Lemma 18.10
that it is transitive.
Let χ be a Dirichlet character of period m and consider the set X = {χ′ : χ′ ⪯ χ}. Each
χ′ ∈ X necessarily has period m′ dividing m, and there is at most one χ′ of period m′ for
each divisor m′ of m, by Lemma 18.10. Thus X is finite, and nonempty (since χ ∈ X).
Suppose χ1 , χ2 ∈ X have periods m1 and m2 , respectively. Then m1 and m2 both divide
m, as does m3 = gcd(m1 , m2 ). We have a commutative square of surjective unit group
homomorphisms induced by reduction maps:
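\[
\begin{array}{ccc}
(\mathbb{Z}/m\mathbb{Z})^\times & \longrightarrow & (\mathbb{Z}/m_1\mathbb{Z})^\times \\
\downarrow & & \downarrow \\
(\mathbb{Z}/m_2\mathbb{Z})^\times & \longrightarrow & (\mathbb{Z}/m_3\mathbb{Z})^\times .
\end{array}
\]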
6 In fact, one can show that every surjective homomorphism of finite rings induces a surjective homomorphism of unit groups, but this does not hold in general (consider Z → Z/5Z, for example).
From Lemma 18.10 we know that χ is constant on residue classes in (Z/mZ)× that are
congruent modulo either m1 or m2 , and therefore χ is constant on residue classes in (Z/mZ)×
that are congruent modulo m3 , as are χ1 and χ2 (which are determined by χ). It follows
that there is a unique Dirichlet character χ3 of modulus m3 that induces χ, χ1 , and χ2 .
Thus every pair χ1, χ2 ∈ X has a lower bound χ3 under the partial ordering ⪯ that is
compatible with the total ordering of X by period. This implies that X contains a unique
element χ̃ that is minimal, both with respect to the partial ordering ⪯ and with respect to
the total ordering by period; it must be primitive, by the transitivity of ⪯.
Definition 18.14. The conductor of a Dirichlet character χ is the period of the unique
primitive Dirichlet character χ̃ that induces χ.
Corollary 18.15. For a Dirichlet character χ of modulus m we have ∑_{n∈Z/mZ} χ(n) ≠ 0 if
and only if χ has conductor 1.
Proof. This follows immediately from Lemma 18.12.
Corollary 18.16. Let M(m) denote the set of Dirichlet characters of modulus m, let X(m)
denote the set of primitive Dirichlet characters of conductor dividing m, and let Ĝ(m) denote
the character group of (Z/mZ)×. We have canonical bijections
\[
\begin{array}{ccccc}
M(m) & \xrightarrow{\ \sim\ } & X(m) & \xrightarrow{\ \sim\ } & \widehat{G}(m) \\
\chi & \longmapsto & \tilde{\chi} & \longmapsto & (n \mapsto \tilde{\chi}(n)).
\end{array}
\]
Proof. By Theorem 18.13, the map χ ↦ χ̃ is injective, and it is also surjective: each
χ̃ ∈ X(m) induces the character χ ∈ M(m) obtained by setting χ(n) := χ̃(n) for n ∈ (Z/mZ)×
and extending by zero. As previously noted, the map χ ↦ (n ↦ χ(n)) defines a bijection
M(m) → Ĝ(m) (a group isomorphism, in fact), and this bijection factors through the map
χ ↦ χ̃, since χ̃(n) = χ(n) for n ∈ (Z/mZ)×.
Remark 18.17. Corollary 18.16 implies that we can make X(m) a group by defining
\[
\tilde{\chi}_1 \tilde{\chi}_2 := \widetilde{\chi_1 \chi_2} .
\]
Note that \widetilde{χ1 χ2} is not the pointwise product of χ̃1 and χ̃2 (which is typically
not primitive); it is the unique primitive character that induces the pointwise product.
Example 18.18. 12-periodic Dirichlet characters, ordered by period m and conductor c.
m            1    2    3    3    4    6    6   12   12
c            1    1    1    3    4    1    3    4   12
n = 0        1    0    0    0    0    0    0    0    0
n = 1        1    1    1    1    1    1    1    1    1
n = 2        1    0    1   -1    0    0    0    0    0
n = 3        1    1    0    0   -1    0    0    0    0
n = 4        1    0    1    1    0    0    0    0    0
n = 5        1    1    1   -1    1    1   -1    1   -1
n = 6        1    0    0    0    0    0    0    0    0
n = 7        1    1    1    1   -1    1    1   -1   -1
n = 8        1    0    1   -1    0    0    0    0    0
n = 9        1    1    0    0    1    0    0    0    0
n = 10       1    0    1    1    0    0    0    0    0
n = 11       1    1    1   -1   -1    1   -1   -1    1
mod-12      no   no   no   no   no  yes  yes  yes  yes
principal  yes  yes  yes   no   no  yes   no   no   no
primitive  yes   no   no  yes  yes   no   no   no  yes
The fact that χ(n) ∈ {0, ±1} for all 12-periodic Dirichlet characters χ follows from the
fact that the exponent of (Z/12Z)× is 2; thus (im χ) ∩ U(1) ⊆ µ2 = {±1}.
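The table in Example 18.18 can be checked mechanically. The Python sketch below hard-codes the nine columns (transcribed from the table, so any transcription slip would show up as a failed check) and tests three things: the stated period, total multiplicativity modulo 12, and the criterion of Corollary 18.15 that the sum over one period is nonzero exactly when the conductor is 1. The names `TABLE`, `least_period`, and `is_multiplicative` are introduced here for illustration only.

```python
# Columns of Example 18.18: (period m, conductor c, values chi(0), ..., chi(11)).
TABLE = [
    (1, 1, [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]),
    (2, 1, [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]),
    (3, 1, [0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1]),
    (3, 3, [0, 1, -1, 0, 1, -1, 0, 1, -1, 0, 1, -1]),
    (4, 4, [0, 1, 0, -1, 0, 1, 0, -1, 0, 1, 0, -1]),
    (6, 1, [0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1]),
    (6, 3, [0, 1, 0, 0, 0, -1, 0, 1, 0, 0, 0, -1]),
    (12, 4, [0, 1, 0, 0, 0, 1, 0, -1, 0, 0, 0, -1]),
    (12, 12, [0, 1, 0, 0, 0, -1, 0, -1, 0, 0, 0, 1]),
]

def least_period(values):
    """Smallest divisor d of len(values) such that values repeats with period d."""
    size = len(values)
    return min(d for d in range(1, size + 1)
               if size % d == 0
               and all(values[n] == values[n % d] for n in range(size)))

def is_multiplicative(values):
    """Check chi(ab) = chi(a) chi(b) for all residues a, b modulo len(values)."""
    size = len(values)
    return all(values[(a * b) % size] == values[a] * values[b]
               for a in range(size) for b in range(size))

def check_table(table):
    """Return True if every column passes the period, multiplicativity, and sum checks."""
    return all(
        least_period(values) == m
        and is_multiplicative(values)
        and (sum(values[:m]) != 0) == (c == 1)  # Corollary 18.15
        for m, c, values in table
    )
```

Calling `check_table(TABLE)` is expected to return `True` for the values as transcribed above.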
18.3 Dirichlet L-functions
Definition 18.19. The Dirichlet L-function associated to a Dirichlet character χ is
\[
L(s, \chi) := \prod_{p} \bigl(1 - \chi(p) p^{-s}\bigr)^{-1} = \sum_{n \ge 1} \chi(n) n^{-s} .
\]
The sum and product converge absolutely for Re(s) > 1, since |χ(n)| ≤ 1; thus L(s, χ) is
holomorphic on Re(s) > 1.
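As a numerical sanity check of Definition 18.19, the following Python sketch compares a truncated Dirichlet series with a truncated Euler product at s = 2 for the non-principal character of modulus 4 from Example 18.18. The function names and the truncation bounds are arbitrary choices made for this illustration, and floating-point agreement to a few digits is of course not a proof.

```python
def chi4(n):
    """Non-principal Dirichlet character of modulus 4 (column m = 4, c = 4 of Example 18.18)."""
    return (0, 1, 0, -1)[n % 4]

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Zero out the multiples of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(limit + 1) if sieve[p]]

def dirichlet_series(chi, s, terms):
    """Partial sum of chi(n) * n^(-s) over 1 <= n <= terms."""
    return sum(chi(n) / n ** s for n in range(1, terms + 1))

def euler_product(chi, s, prime_bound):
    """Partial product of (1 - chi(p) * p^(-s))^(-1) over primes p <= prime_bound."""
    result = 1.0
    for p in primes_up_to(prime_bound):
        result /= 1 - chi(p) / p ** s
    return result
```

With both truncations set to 20000, for example, the two values `dirichlet_series(chi4, 2.0, 20000)` and `euler_product(chi4, 2.0, 20000)` should agree to several decimal places, since both expressions converge absolutely at s = 2.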
For the trivial Dirichlet character 1 we have L(s, 1) = ζ(s). For the principal character
1m of modulus m induced by 1 we have
\[
\zeta(s) = L(s, \mathbb{1}_m) \prod_{p \mid m} \bigl(1 - p^{-s}\bigr)^{-1} .
\]
The product on the RHS is finite, hence bounded and nonzero as s → 1+ , so the L-function
L(s, 1m ) has a simple pole at s = 1 with residue
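\[
\operatorname{res}_{s=1} L(s, \mathbb{1}_m)
= \lim_{s \to 1^+} (s-1)\,\zeta(s) \prod_{p \mid m} \bigl(1 - p^{-s}\bigr)
= \prod_{p \mid m} \bigl(1 - p^{-1}\bigr)
= \frac{\varphi(m)}{m} .
\]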
The L-functions of non-principal Dirichlet characters do not have a pole at s = 1.
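The last equality in the residue computation, that the product of (1 − 1/p) over primes p dividing m equals φ(m)/m, is easy to confirm in exact arithmetic for small m. The Python sketch below does so with `fractions.Fraction`; the function names are hypothetical, and φ(m) is computed by direct counting so that the check does not rely on the identity being tested.

```python
from fractions import Fraction
from math import gcd

def euler_phi(m):
    """Count the integers 1 <= n <= m coprime to m."""
    return sum(1 for n in range(1, m + 1) if gcd(n, m) == 1)

def prime_divisors(m):
    """Return the set of primes dividing the positive integer m (trial division)."""
    primes, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            primes.add(p)
            m //= p
        p += 1
    if m > 1:
        primes.add(m)
    return primes

def residue_of_principal_L(m):
    """Product of (1 - 1/p) over primes p | m, the residue of L(s, 1_m) at s = 1."""
    result = Fraction(1)
    for p in prime_divisors(m):
        result *= 1 - Fraction(1, p)
    return result
```

For m = 12 the primes are 2 and 3, giving (1/2)(2/3) = 1/3, which matches φ(12)/12 = 4/12.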
Proposition 18.20. Let χ be a non-principal Dirichlet character of modulus m. Then
L(s, χ) extends to a holomorphic function on Re s > 0.
Proof. Define the function T : R≥0 → C by
T (x) :=
X
χ(n).
0