Littlewood’s Three Principles

In measure theory, the notion of measurability restricts sets and functions so that limit operations are sensible. With non-measurabilty manifesting only through the Axiom of Choice, measurability is a semantically rich class, and in particular, Littlewood’s Three Principles specify precisely how measurability relates back to more elementary building blocks as follows:

  1. Every measurable set is nearly open or closed.
  2. Every measurable function is nearly continuous.
  3. Every convergent sequence of measurable functions is nearly uniformly convergent.

Here in this post, we flesh out these principles in detail, with an emphasis on how these various concepts approximate one another. Each section corresponds to one principle and is independent from another. We primarily focus on the Lebesgue measure m on Euclidean space before discussing the abstract generalization. While there are many equivalent definitions of measurability, for simplicity we will only use these definitions in this discussion:

A set E\subset\mathbb{R}^{d} is measurable if for every \eps>0, there is some open set V\subset\mathbb{R}^{d} containing E such that m(V\backslash E)\leq\eps, where m is the Lebesgue measure.
A function f:\mathbb{R}^{d}\rightarrow\mathbb{R} is measurable if for every open interval I\subset\mathbb{R}, the preimage of I under f, f^{-1}(I), is a measurable set.

We will use these facts about the Lebesgue measure throughout.

Let m be the Lebesgue measure on \mathbb{R}^d. Let \{E_n\}_{n\in \mathbb{N}} be a collection of measurable sets.
1. (Monotonicity) If E_1 \subset E_2, then m(E_1) \leq m(E_2).
2. (Sub-additivity) m(\bigcup_n E_n) \leq \sum_n m(E_n).
3. (Additivity) If E_n are disjoint, m(\bigcup_n E_n) = \sum_n m(E_n).
4. (Upward monotone convergence) If E_1 \subset E_2 \subset \ldots, m(\bigcup_n E_n) = \lim_{n\rightarrow\infty} m(E_n).
5. (Downward monotone convergence) If E_1 \supset E_2 \supset \ldots with m(E_n)<\infty for some n, then m(\bigcap_n E_n) = \lim_{n\rightarrow\infty} m(E_n).

First Principle

By definition, every measurable set is already arbitrary close in “volume” to an open set. Nonetheless, we can say a lot more about its structure: in \mathbb{R}^{d}, a measurable set can be approximated by a disjoint collection of cubes. In the visual setting where d=2, a measurable set can be approximated by a “pixelation” of squares. We first define the notion of cubes.

We say that Q\subset\mathbb{R}^{d} is a closed cube of length \ell left-aligned at (a_{1},\ldots,a_{d})\in\mathbb{R}^{d} if Q=\Pi_{i=1}^{d}[a_{i},a_{i}+\ell].

The structure of measurable sets actually inherit from the structure of open sets, as stated in the following proposition:

Let V\subset\mathbb{R}^{d} be open. Then V is a countable union of closed cubes, and every pair of these cubes is disjoint except at the boundary.

We first state and prove the First Principle before proving the proposition.

Let E\subset\mathbb{R}^{d} be a measurable set with finite measure. For every \eps>0, there exist N disjoint cubes Q_{1},\ldots,Q_{N} such that m(E\bigtriangleup Q)\leq\eps, where Q=\cup_{n=1}^{N}Q_{n}.
(of theorem) Fix \eps>0. It suffices to show that there exist N disjoint cubes Q_{1},\ldots,Q_{N} such that m(E\backslash\bigcup_{n=1}^{N}Q_{n})\leq\eps/2 and m(\bigcup_{n=1}^{N}Q_{n}\backslash E)\leq\eps/2. Since E is measurable, there is an open set V containing E such that m(V\backslash E)\leq\eps/2. In particular, m(V)<\infty since m(E)<\infty. From the above proposition, V=\bigcup_{n}Q'_{n}, where Q'_{n} is a closed cube and pairwise disjoint except at the boundary. For each closed cube Q'_{n}, there exists a cube Q_{n}\subsetneq Q'_{n} such that m(Q_{n})\geq m(Q'_{n})-\frac{\eps}{4}\cdot 2^{-n}, and Q_{n} is pairwise disjoint. Now, by the upward monotone convergence of measure, there is some N such that m(\bigcup_{n=1}^{N}Q'_{n})\geq m(V)-\eps/4. Let Q=\bigcup_{n=1}^{N}Q_{n}. Then m(Q\backslash E)\leq m(V\backslash E)\leq\eps/2 by the monotonicity of m. For the other case,

    \begin{align*} m(E\backslash Q) & \leq m(V\backslash Q)\\  & = m(V)-m(Q)\\  & = m(V)-\sum_{n=1}^{N}m(Q_{n})\\  & \leq m(V)-\sum_{n=1}^{N}m(Q'_{n})+\frac{\eps}{4\cdot2^{n}}\\  & \leq m(V)-\sum_{n=1}^{N}m(\bigcup_{n=1}^{N}Q'_{n})+\frac{\eps}{4}\\  & \leq \eps/2, \end{align*}

where the lines follow from the monotonicity of m, additivity of m, disjointness of Q_{n}, construction of Q_{n}, sub-additivity of m, and construction of Q'_{n}, respectively.

(of proposition) Note that \mathbb{R}^{d} can be divided into cubes of length 2^{-n}:

    \[ G_{n}=\{ \Pi_{i=1}^{d}[a_{i}2^{-n},(a_{i}+1)2^{-n}]:a_{1},\ldots,a_{d}\in\mathbb{Z}\}, \]

with \mathbb{R}^{d}=\bigcup_{B\in G_{n}}B. Let V be an open set in \mathbb{R}^{d}, and define

    \[ Q=\bigcup_{n\in\mathbb{N}}\{B\in G_{n}:B\subset V\}. \]

In other words, Q contains all cubes of length 2^{-n} for some integer n that reside entirely within V. Since G_n is countable and Q is a subset of countable union of G_n, Q is countable. By construction Q\subset V. For the other direction, since V is open, for every v\in V, there is an open ball centered at v of radius r, B(v,r), that is contained inside V. By Pythagorean Theorem, there exists a cube with length 2n centered at v that is contained inside B(v,r). For every i, let a_{i}=\max_{z\in\mathbb{Z}} \{z\cdot2^{-n}:z2^{-n}\leq v_i\}. By maximality, v_{i}\in[a_{i}, a_{i}+2^{-n}], with a_{i} and a_{i}+2^{-n} at a distance less than r from v_i. Thus, v is inside the cube of length n left-aligned at (a_{1},\ldots,a_{d}), implying v\in Q. Lastly, to see that cubes are almost disjoint except at the boundary, note we may modify Q so that if a cube is selected from G_{n}, then we can remove its sub-cubes from G_{m} for every m>n. By this modification, a selected cube in Q is never contained in another.

The First Principle can also be extended into any abstract measure space (X,\mathcal{B},\mu), where X is the underlying set, \mathcal{B} is the set of measurable subsets of X, and \mu is the measure for (X,\mathcal{B}). In the concrete setting, let X be a bounded subset of \mathbb{R}^{d}, \mathcal{B} the set of all measurable sets contained in X, and \mu the Lebesgue measure. The First Principle effectively states that every measurable set is equivalent (up to a difference of measure zero) to a countable union of closed sets, and a direct generalization of what we have just proved is that every bounded element of the \sigma-algebra “generated” from closed sets can be approximated by closed sets themselves. More precisely,

Let (X,\mathcal{B},\mu) be a measure space with \mu(X)<\infty. Suppose \mathcal{A\subset B} is an algebra. Then for every \eps>0, E\in\sigma(\mathcal{A}), where \sigma(\mathcal{A}) is the intersection of all \sigma-algebra containing \mathcal{A}, there exists F\in\mathcal{A} such that \mu(E\triangle F)\leq\eps.
Fix \eps>0 and let \mathcal{P}=\{E\subset X:\exists F\in\mathcal{A}\text{ with }\mu(E\triangle F)\leq\eps\}. Since \mathcal{P} contains \mathcal{A}, it suffices to show that \mathcal{P} is itself a \sigma-algebra. It is easy to see that \mathcal{P} is closed under complement since \mathcal{A} is. For countable union, for every n\in\mathbb{N}, let E_{n}\in\mathcal{P}, E=\bigcup_{n}E_{n}. By definition, for every n\in\mathbb{N}, there is some F_{n}\in\mathcal{A} such that \mu(E_{n}\triangle F_{n})\leq\eps'\cdot2^{-n}, where \eps' will be chosen later. By upward monotone convergence, there exists some N\in\mathbb{N} such that \mu(\bigcup_{n=1}^{N}E_{n})\geq\mu(E)-\eps'. Write E^{N}=\bigcup_{n=1}^{N}E_{n} and F^{N}=\bigcup_{n=1}^{N}F_{n}, and note that F^{N} is in \mathcal{A}. Then

    \begin{align*} \mu(F^{N}\backslash E) & \leq\mu(F^{N}\backslash E^{N})\\  & \leq\sum_{n=1}^{N}\mu(F_{n}\backslash E_{n})\\  & \leq\eps', \end{align*}

where the inequalities follow from the monotocity of \mu, sub-additivity
of \mu, and our choice of F_{n}, respectively. Similarly,

    \begin{align*} \mu(E\backslash F^{N}) & =\mu(E^{N}\backslash F^{N})+\mu((E\backslash E^{N})\backslash F^{N})\\  & \leq\sum_{n=1}^{N}\mu(E_{n}\backslash F_{n})+\mu(E\backslash E^{N})\\  & \leq\eps'+\eps'. \end{align*}

Thus, \mu(E\triangle F^{N})\leq 3\eps', and setting \eps'=\eps/3 finishes the proof.

Second Principle

(Lusin’s Theorem) Let f:\mathbb{R}^{d}\rightarrow\mathbb{R} be a finite, measurable function. For every \eps>0, there exists A\subset\mathbb{R}^{d} such that m(\mathbb{R}^{d}\backslash A)<\eps and f_{|A}, the restriction of f to A, is continuous.

As an example, consider the indicator function 1_{\mathbb{Q}}:[0,1]\rightarrow\{0,1\} where \mathbb{Q} is the set of rationals. When restricted to the domain of [0,1]\backslash\mathbb{Q}, the function becomes identically zero. However, the function remains discontinuous at the points in [0,1]\backslash\mathbb{Q}. Thus, the continuity asserted in Lusin’s Theorem applies only to the restriction of the original function.

Before proving Lusin’s Theorem, we first prove that the continuous functions are dense in absolutely integrable functions.

Let f:\mathbb{R}^{d}\rightarrow\mathbb{R} be a measurable function. Suppose \int|f|<\infty. Then for every \eps>0, there exists a continuous function g such that \int|f-g|\leq\eps.
We proceed in stages, where in each stage f is restricted to an elementary, but more general class of functions than the previous stage. Note that in all cases, the hypothesis that f is measurable and \int |f| < \infty always applies.

Case 1: f=1_{Q}, where Q is a cube in \mathbb{R}^{d} (as defined in the First Principle).
When d=1, Q is an interval, and f has two points of discontinuity at the endpoints of this interval. A continuous function g can be defined so that it agrees with f everywhere, except when near each endpoint, g is a nearly vertical line dropping from 1 to 0. The case for larger d can be similarly constructed.

Case 2: f=\sum_{Q\in\mathcal{C}}1_{Q}, where \mathcal{C} is a finite collection of cubes.
By Case 1, each 1_{Q} has a continuous function g_{Q} such that \int|1_{Q}-g_{Q}|\leq\eps/|\mathcal{C}|. \sum_{Q\in\mathcal{C}}g_{Q} is continuous, and by triangle inequality, \int|f-\sum_{Q\in\mathcal{C}}g_{Q}|\leq\eps.

Case 3: f=1_{E}, where E is a measurable set.
From the First Principle, E can be approximated by a finite union of disjoint cubes, and the result follows by Case 2 and applying triangle inequality.

Case 4: f is simple, i.e., f=\sum_{E\in\mathcal{E}}1_{E}, where \mathcal{E} is a finite collection of measurable sets.
Follows from a similar argument as in Case 2.

Case 5: f:\mathbb{R}^{d}\rightarrow[0,\infty).
By definition, \int f=\sup_{g\leq f,g\text{ simple}}\int g. Since \int f<\infty, there exists some simple g\leq f with \int f-g\leq O(\eps). Result follows from Case 4 and triangle inequality.

Case 6: General case.
Write f=f_{+}-f_{-}, where f_{+}=\max(f,0) and f_{-}=\min(f,0). Then Case 5 applies to f_{+} and f_{-}, and the general case follows from Case 5 and triangle inequality.

(Lusin’s Theorem) We proceed in stages by proving the statement for progressively more general classes of functions. In all casees, the hypothesis that f is measurable and finite always applies.

Case 1: \int|f|<\infty.
Fix \eps>0. By the previous proposition, for every positive integer n, \eps_{n}>0 (to be specified), there is a continuous function g_{n} such that \int|f-g_{n}|\leq\eps_{n}. Define

    \[ E_{n}=\{x\in\mathbb{R}^{d}:|f(x)-g_{n}(x)|\geq1/n\}. \]

By Markov’s Inequality, m(E_{n})\leq n\eps_{n}. Then m(\bigcup_{n}E_{n})\leq\sum_{n}n\eps_{n}, which is at most \eps if we specify \eps_{n}:=\frac{\eps}{n2^{n}}. Let A:=\mathbb{R}^{d}\backslash\bigcup_{n}E_{n}. Then by construction, g_{n} converges to f uniformly inside A. The restriction of g_{n} to A remains continuous, and since continuity is perserved under uniform limit, f_{|A} is continuous as well.

Case 2: f is a bounded function.
Consider the “vertical” truncation of f: f_{N}(x):=f(x)1_{\{|x|\leq N\}}. Since f is bounded, f_{N} is bounded and \int|f_{N}|<\infty. By Case 1, there exists some A_{N} such that m(\mathbb{R}^{d}\backslash A_{N})\leq\eps2^{-N}, with f_{N} continuous when restricted to A_{N}. Let A=\bigcap_{N}A_{N}, then m(\mathbb{R}^{d}\backslash A)\leq\eps, thus every vertical truncation f_{N} is continuous when restricted to A. Since continuity is a local property, for every x\in A, f must agree with some f_{N} on a neighborhood of x, so f is continuous on A as well.

Case 3: General case.
Consider the “horizontal” truncation of f: f^{(N)}(x):=f(x)1_{\{|f(x)|\leq N\}}. Since f^{(N)} is bounded, by Case 2, there is some A_{N} such that m(\mathbb{R}^{d}\backslash A_{N})\leq\eps2^{-N}, with f^{(N)} continuous when restricted to A_{N}. Similarly as in Case 2, we define A=\bigcap_{N}A_{N}, with m(\mathbb{R}^{d}\backslash A)\leq\eps, and every truncation f_{N} is continuous when restricted to A. Since f is finite, for every x\in A, f must agree f^{(N)} for some N on a neighborhood of x and is continuous on A as well.

One may extend Lusin’s Theorem to an abstract measure space X, where X is Radon so that its measure is also compatibile with its topology so that continuity can be defined appropriately.

Third Principle

(Egorov’s Theorem) X\subset\mathbb{R}^{d}, with m(X)<\infty. Let f_{n}:X\rightarrow\mathbb{R} be a sequence of measurable functions with f_{n} converging to f pointwise as n\rightarrow\infty. Then for every \eps>0, there exists a measurable subset A\subset X with m(X\backslash A)\leq\eps, where f_{n} converging to f uniformly within A.
Egorov’s Theorem also implies Lusin’s Theorem, the Second Principle. If f is measurable, by an alternate definition of measurability, it is also the pointwise limit of simple functions. These simple functions are also approximated by continuous functions. By Egorov’s Theorem, these continuous functions uniformly converge to f except on a small subset E. Since continuity is preserved under uniform convergence, f is continuous on a domain without E.
By definition, f_{n} converges to f uniformly on A iff for every x\in A, for every positive integer m, there is some positive integer N_{m} with |f_{n}(x)-f(x)|<1/m. Let

    \[ A_{m,N}=\{x\in X:\forall n\geq N\big|f_{n}(x)-f(x)\big|<1/m\}. \]

We see that f_{n} converges to f uniformly on A:=\bigcap_{m}A_{m,N_{m}}, where N_{m} will be chosen later. First note that A_{m,N} is measurable: the limit of measurable functions is measurable, so f and thus f-f_{n} is meaurable, implying that A_{m,N}, the pre-image of the open interval (-1/m,1/m) under f_{n}-f, is measurable. Since A is a countable intersection of A_{m,N_{m}}, A is also measurable. Second, note that A_{m,1}\subset A_{m,2}\subset\ldots is an increasing sequence, and its complement X\backslash A_{m,1}\supset X\backslash A_{m,2}\supset\ldots is a decreasing sequence. Since f_{n} converges to f, X\backslash A_{m,n}\rightarrow\emptyset as n\rightarrow\infty. Since m(X)<\infty, by the downward monotone convergence of measure, m(X\backslash A_{m,n})\rightarrow0 as n\rightarrow\infty. Thus, there exists some N_{m} such that m(X\backslash A_{m,N_{m}})<\eps\cdot2^{-m}.

Then

    \begin{align*} m(X\backslash A) & =m\left(\bigcup_{m}X\backslash A_{m,N_{m}}\right)\\  & \leq\sum_{m}m(X\backslash A_{m,N_{m}})\\  & \leq\sum_{m}\frac{\eps}{2^{m}}\\  & =\eps, \end{align*}

where the lines follow by the definition of A, sub-additivity of measure, construction of N_{m}, and geometric series, respectively.

It is easy to extend the above setting directly to an abstract one, with the proof following verbatim as before, except replacing the Lebesgue measure m by an arbitrary measure \mu:

(Abstract Egorov’s Theorem) Let (X,\mathcal{B},\mu) be a measure space with \mu(X)<\infty. Let f_{n}:X\rightarrow\mathbb{R} be a sequence of measurable functions, with f_{n} converging to f pointwise as n\rightarrow\infty. Then for every \eps>0, there exists a measurable subset A\subset X with m(X\backslash A)\leq\eps, where f_{n} converges to f uniformly within A.
When the measure space is a probability space, the finiteness of measure is trivially true. Since random variables are measurable by definition, Egorov’s Theorem states that a sequence of converging random variables is then almost uniformly convergent, except on a subset with small probability.

Lastly, we can lift the restriction that the domain of f_{n} has finite measure, but uniform convergence must still be local.

(Egorov’s Theorem II) Let f_{n}:\mathbb{R}^{d}\rightarrow\mathbb{R} be a sequence of measurable functions with f_{n} converging to f pointwise as n\rightarrow\infty. Then for every \eps>0, there exists a measurable subset A\subset X with m(X\backslash A)\leq\eps, where f_{n} converges to f locally uniform within A, i.e., for every bounded subset B\subset A, f_{n} converges to f uniformly on B.
We will simply describe how to modify the preceding proof. For every positive integer m, let B_{m} be the ball of radius of m centered at the origin. Define E_{m,n}=B_{m}\bigcap(\mathbb{R}^{d}\backslash A_{m,n}). Then following the same argument applied to E_{m,n}, there exists some N_{m} such that m(E_{m,N_{m}})\leq\eps2^{-m}. Then letting E=\bigcup_{m}E_{m,N_{m}}, E has measure at most \eps. By definition,

    \[ \mathbb{R}^{d}\backslash E=\bigcap_{m}\left(A_{m,N_{m}}\bigcup\mathbb{R}^{d}\backslash B_{m}\right), \]

so any bounded B must be inside B_{m_{0}} for some m_{0}, so B\backslash E=B\bigcap\left(\bigcap_{m}A_{m,N_{m}}\right), i.e., f_n converges to f uniformly on B.

In both variations of Egorov’s Theorem, the boundedness condition is necessary. Otherwise, when d=1, consider f_{n}(x)=1_{\{x\in[n,n+1)\}}, which converges to the zero function pointwise, but can never converge uniformly to zero. One can see that the mass of this function “escapes” to horizontal infinity. This is exemplified in the proof above, where convergence is not uniform on \mathbb{R}^{d}\backslash E=\bigcap_{m}\left(A_{m,N_{m}}\bigcup\mathbb{R}^{d}\backslash B_{m}\right): a sequence of points x_{n} with |x_{n}|=n+1 can all reside in \mathbb{R}^{d}\backslash E but remain outside of \bigcap_{m}A_{m,N_{m}}.

Dirichlet’s Theorem for arithmetic progressions with difference 4

From antiquity, Euclid’s Theorem asserts that the number of primes is infinite. Since all primes except 2 are odd, a re-phrasement of Euclid’s theorem is that the set of odd integers \{2k+1\}_{k\in\mathbb{Z}} contains an infinite number of primes. A natural generalization, proved by Dirichlet, is that every arithmetic progression of the form \{qk+\ell\}_{k\in\mathbb{Z}} contains an infinite number of primes, provided that q and k are relatively prime.

(Dirichlet’s Theorem) If (q,\ell)=1, then the arithmetic progression \{qk+\ell\}_{k\in\mathbb{Z}} contains infinitely many primes, where (q, \ell) denotes the greatest common divisor of q and \ell.

The proof is one of the first application of Fourier analysis on finite abelian groups. Here in this post, we prove Dirichlet’s theorem when q=4, i.e., both the arithmetic progressions \{4k+1\}_{k\in\mathbb{Z}} and \{4k+3\}_{k\in\mathbb{Z}} contain an infinite number of primes. The restriction to q=4 highlights the main structure of Dirichlet’s proof without needing as much background from number theory.

Before jumping directly into the proof, let us first discuss some number-theoretic background and the high-level strategy. First recall Euclid’s argument. For the sake of contradiction, there are exactly N primes p_{1},\ldots,p_{N} of the 2k+1. Then the number M:=2\prod_{i=1}^{N}p_{i}+1 is larger than every p_{i} and cannot be a prime of the form 2k+1. On the other hand, from the Fundamental Theorem of Arithmetic, M is the product of primes, and one of them must be congruent to 1 \Mod 2 since M itself is. So some p_{i} must divide both M and and 2\prod_{i=1}^{N}p_{i}, implying that p divides 1=M-2\prod_{i=1}^{N}p_{i}, a contradiction. One can easily verify that this argument can be adapted to \{4k+3\}_{k\in\mathbb{Z}} by taking M=4\prod_{i=1}^{N}p_{i}+3. However, the argument breaks down when applied to the arithmetic progression \{4k+1\}_{k\in\mathbb{Z}}. The reason is that the number 4\prod_{i=1}^{N}p_{i}+1 is no longer guaranteed to have a prime divisor of the form 4k+1.

Instead, Euler’s analytic proof of Euclid’s Theorem provides the proper framework to extend the infinitude of primes to arithmetic progressions, where the zeta function \zeta(s)=\sum_{n=1}^{\infty}\frac{1}{n^{s}} can be written as the product of geometric series of primes:

(Euler-Dirichlet Product Formula) For every s>1,

    \[ \sum_{n=1}^{\infty}\frac{1}{n^{s}}=\prod_{p}\frac{1}{1-\frac{1}{p^{s}}}. \]

Furthermore, if f:\mathbb{Z}\rightarrow \mathbb{R} is multiplicative, i.e., f(xy)=f(x)f(y), then

    \[ \sum_{n=1}^{\infty}\frac{f(n)}{n^{s}}=\prod_{p}\frac{1}{1-\frac{f(p)}{p^{s}}}. \]

(A note on notation: p always denotes a prime, and \sum_{p} and \prod_{p} range over all primes.) The identity can be viewed as an analytic version of the Fundamental Theorem of Arithmetic: every positive integer is the product of some prime powers, and the expression 1/(1-p^{-s}) is a geometric series. Since \zeta(1) diverges, The Product Formula also implies that the number of primes is infinite. With additional calculation, Euler proved the stronger form of Euclid’s Theorem:

    \[ \sum_{p}\frac{1}{p^{s}}\rightarrow\infty \text{ as } s\rightarrow1^{+}. \]

Now to prove Dirichlet’s Theorem, we will show that the sum \sum_{p}\frac{D(p)}{p^{s}}\rightarrow\infty as s\rightarrow1^{+}, where D_{\ell} is the indicator function on \mathbb{Z} for being congruent to \ell \Mod q. By Fourier analysis, \sum_{p}\frac{D_{\ell}(p)}{p^{s}} can be decomposed into two sums. Roughly, the first corresponds to the zeroth Fourier coefficient of D_{\ell} and evaluates to \sum_{p}\frac{1}{p^{s}}, which diverges as s\rightarrow1^{+}, and the second sum corresponds to the non-zero Fourier coefficients and is always bounded. In our writeup, the advantage of restricting to q=4 is that the estimate of the second sum becomes significantly simpler, without further tools from number theory.

(Proof of Dirichlet’s Theorem restricted to q=4.) Let G:=\mathbb{Z}_{4}^{*} be the multiplicative group of integers modulo 4, and \delta_{\ell}:G\rightarrow\{0,1\} be the indicator function such that \delta_{\ell}(x)=1 if x\equiv\ell\Mod 4 and 0 otherwise. Then by Fourier expansion,
for every n\in G,

    \begin{eqnarray*} \delta_{\ell}(n) & = & \sum_{e\in\hat{G}}\hat{\delta_{\ell}}(e)e(n),\\  & = & \frac{1}{|\hat{G}|}\sum_{e\in\hat{G}}\overline{e(\ell)}e(n), \end{eqnarray*}

where e ranges over all characters of the dual group of G. Since G is a finite abelian group, it is is also isomorphic to its dual, and with |G|=2, we can conclude that the only characters of \hat{G} are: the trivial character e_{0} that is identically 1, and the character e_{1} where e_{1}(x)=1 if
x\equiv1\Mod 4 and -1 if x\equiv3\Mod 4. Thus, we can simplify and rewrite \delta_{\ell} as

    \begin{eqnarray*} \delta_{\ell}(n) & = & \frac{1}{2}\left(\overline{e_{0}(\ell)}e_{0}(n)+\overline{e_{1}(\ell)}e_{1}(n)\right)\\ & = & \frac{1}{2}\left(e_{0}(n)+e_{1}(\ell)e_{1}(n)\right). \end{eqnarray*}

(For sanity check, one can easily verify via direct computation, without Fourier analysis, that the identities \delta_{1}(n)=\frac{1+e_{1}(n)}{2} and \delta_{3}(n)=\frac{1-e_{1}(n)}{2} hold.) For the rest of this proof, we will rely on the the fact that the characters e_{0} and e_{1} are real-valued.

To prove the theorem, we have to sum over all integers instead of just the multiplicative subgroup of \mathbb{Z}_{4}^{*}. To this end, for every character e\in\hat{G}, we define its Dirichlet character \chi to be the extension to all of \mathbb{Z} as follows: \chi(n)=e(n) if (n,4)=1 and 0 otherwise. Let D_{\ell}:\mathbb{Z}\rightarrow\{0,1\} be the indicator function of being congruent to \ell \Mod 4. Then it is easy to see that

    \[ D_{\ell}(n)=\frac{1}{2}\left(\chi_{0}(n)+\chi_{1}(\ell)\chi_{1}(n)\right), \]

where \chi_{0},\chi_{1} are the Dirichlet characters of e_{0},e_{1}, respectively. Since we can express the indicator function D_{\ell} in terms of Dirichlet characters, we have

    \begin{eqnarray*} \sum_{p\equiv\ell\Mod 4}\frac{1}{p^{s}} & = & \sum_{p}\frac{D_{\ell}(p)}{p^{s}}\\ & = & \frac{1}{2}\sum_{p}\frac{\chi_{0}(p)}{p^{s}}+\frac{\chi_{1}(\ell)}{2}\sum_{p}\frac{\chi_{1}(p)}{p^{s}}. \end{eqnarray*}

It suffices to prove that \sum_{p\equiv\ell\Mod 4}\frac{1}{p^{s}} \rightarrow \infty as s\rightarrow1^{+}. In particular, we will show that

  • \sum_{p}\frac{\chi_{0}(p)}{p^{s}} \rightarrow \infty as s\rightarrow1^{+}, which coincides with Euler’s argument since \chi_{0}(p)=1 for all p > 2, and
  • \sum_{p}\frac{\chi_{1}(p)}{p^{s}}=O(1).

To estimate these sums, we prove the following:

Let \chi be a Dirichlet character for G. Then \sum_{p}\frac{\chi(p)}{p^{s}}=\log(L(s,\chi))+O(1), where L(s,\chi)=\sum_{n=1}^{\infty}\frac{\chi(n)}{n^{s}}.
(of Proposition) Since \chi is multiplicative, from the Euler-Dirichlet Product Formula,

    \[ L(s,\chi)=\prod_{p}\frac{1}{1-\frac{\chi(p)}{p^{s}}}. \]

Since \chi is real-valued, taking log of both sides, we have

    \begin{eqnarray*} \log(L(s,\chi)) & = & \sum_{p}-\log\left(1-\frac{\chi(p)}{p^{s}}\right)\\ & = & \sum_{p}\frac{\chi(p)}{p^{s}}+O\left(\frac{\chi(p)^{2}}{2p^{2s}}\right)\\ & = & \sum_{p}\frac{\chi(p)}{p^{s}}+O(1), \end{eqnarray*}

where the second line follows from the estimate of the power series \log(1-x)=\sum_{n=1}^{\infty}x^{n}/n=x+O(x^{2}) when |x|<1, and the third from the fact that \sum_{n}\frac{1}{n^{2}} converges.

Now we estimate the two sums using the above proposition. For the trivial Dirichlet character \chi_{0}, we have

    \begin{eqnarray*} \sum_{p}\frac{\chi_{0}(p)}{p^{s}} & = & \log\left(\sum_{n=1}^{\infty}\frac{\chi_{0}(n)}{n^{s}}\right)+O(1)\\ & = & \log\left((1-\frac{1}{2^{s}})\sum_{n=1}^{\infty}\frac{1}{n^{s}}\right)+O(1)\\ & = & \log\left(\sum_{n=1}^{\infty}\frac{1}{n^{s}}\right)+O(1), \end{eqnarray*}

which diverges as s\rightarrow1^{+} since \sum_{n}\frac{1}{n^{s}} diverges as s\rightarrow1^{+}.

For the non-trivial character \chi_{1}, \sum_{n}\frac{\chi_{1}(n)}{n} is simply the alternating series 1-\frac{1}{3}+\frac{1}{5}-\frac{1}{7}+\dots, which converges to a positive number. Thus,

    \begin{eqnarray*} \sum_{p}\frac{\chi_{1}(p)}{p^{s}} & = & \log\left(\sum_{n=1}^{\infty}\frac{\chi_{1}(n)}{n^{s}}\right)+O(1)\\ & \leq & \log\left(\sum_{n=1}^{\infty}\frac{\chi_{1}(n)}{n}\right)+O(1)\\ & = & O(1), \end{eqnarray*}

completing the proof.

Most steps in the preceding proof can be easily extended to an arbitrary q. The Euler-Dirichlet Product Formula works for a multiplicative function taking values in \mathbb{C}. The Fourier expansion of D_{\ell} works as before. The Proposition \sum_{p}\frac{\chi(p)}{p^{s}}=\log(L(s,\chi))+O(1) still holds, but extra care has to be taken since the L-function is now complex-valued. The sum \sum_{p}\frac{\chi_{0}(p)}{p^{s}} diverges, but for a non-trivial Dirichlet character \chi, the fact that the sum \sum_{p}\frac{\chi(p)}{p^{s}} is bounded is significantly more involved. See for instance Apostol or Stein-Shakarchi for further details.

We finish by observing that the proof above has a quantitative form.

(Quantitative version of Dirichlet’s Theorem) For every s > 1,

    \[ \sum_{p\equiv\ell\Mod 4}\frac{1}{p^{s}}=\frac{1}{2}\log\frac{1}{s-1}+O(1). \]

In general, for an arbitrary q, the expression \frac{1}{2}\log\frac{1}{s-1} is replaced by \frac{1}{|\mathbb{Z}_q^*|}\log\frac{1}{s-1}, with |\mathbb{Z}_q^*|=\phi(q), where \phi is the Euler totient function.
Proof proceeds the same as before, where we showed that

    \[ \sum_{p\equiv\ell\Mod 4}\frac{1}{p^{s}}=\frac{1}{2}\sum_{p}\frac{\chi_{0}(p)}{p^{s}}+\frac{\chi_{1}(\ell)}{2}\sum_{p}\frac{\chi_{1}(p)}{p^{s}}, \]

with \sum_{p}\frac{\chi_{0}(p)}{p^{s}}=\log\left(\sum_{n=1}^{\infty}\frac{1}{n^{s}}\right)+O(1) and \sum_{p}\frac{\chi_{1}(p)}{p^{s}}=O(1). Previously, we simply noted that \sum_{n=1}^{\infty}\frac{1}{n^{s}} diverges as s\rightarrow1^{+}. To finish the proof, it suffices to note that

    \begin{eqnarray*} \sum_{n=1}^{\infty}\frac{1}{n^{s}} & \leq & \sum_{n=1}^{\infty}\frac{1}{(n+1)^{s}}+O(1)\\ & \leq & \int_{1}^{\infty}\frac{1}{x^{s}}+O(1)\\ & = & \frac{1}{s-1}+O(1). \end{eqnarray*}

Exponential sum and the equidistribution theorem

Consider the sequence x \Mod 1, 2x \Mod 1, 3x \Mod 1,\ldots, where \Mod 1 gives the fractional part of a real number. The equidistribution theorem states that x is irrational iff for any sub-interval [a, b) of the unit interval, for sufficiently large N, roughly (b-a)N of the numbers x\Mod 1, \ldots, Nx\Mod 1 fall inside [a, b). More precisely,

(Equidistributed sequence) A sequence \{x_n\}_{n=1}^{\infty} with x_n \in \mathbb{R} is equidistributed mod 1 if for every 0 \leq a \leq b \leq 1,

    \[ \lim_{N \rightarrow \infty} \frac{\abs{\{x_n : a \leq x_n \Mod 1 < b \}}}{N}  = b - a. \]

(Equidistribution theorem) The sequence \{nx\}_{n=1}^{\infty} is equidistributed mod 1 iff x is irrational.

Let us try to understand why the sequence behaves differently depending on whether x is rational or not. If x = p/q for some integers p and q, then roughly N/q of the first N multiples of x are integers, i.e., 0 \Mod 1. In fact, it is not difficult to see that for i \in \{0, 1, \ldots, q-1\}, 1/q fraction of the sequence equals i/q \Mod 1. So equidistribution does not occur when x is rational.

Now for k\neq 0 \in \mathbb{Z}, consider the exponential function e_k(x):=e^{2\pi i k x}, which has period 1. As discussed above, roughly 1/q of the sequence \{e_k(nx)\}_{n=1}^{\infty} is 1, so the exponential sum \sum_{n=1}^{\infty} e_k(nx) contains an infinite number of ones. For example if q=2, the exponential sum evaluates to -1 + 1 - 1 + 1 - \ldots, which is Grandi’s series and diverges. For an arbitrary q > 1, the exponential sum is a repeating sum of all q-th roots of unity, and one can use the same technique showing that Grandi’s series diverges to establish that this exponential sum diverges too. However, the case when x is irrational differs: every nonzero integer multiple of x is an non-integer, and thus e_k(nx)\neq 1 for any n\neq 0. Writing r=e^{2\pi i x}, the exponential sum \sum_{n=1}^{\infty} e_k(nx) converges to the geometric series \sum_{n=1}^{\infty} r^n = \frac{r}{r-1}, since r\neq1.

In fact, the above intuition can be summarized into the following criterion due to Weyl:

(Weyl’s criterion) A sequence \{x_n\}_{n=1}^{\infty} is equidistributed mod 1 iff for all integers k\neq 0,

    \[ \lim_{N\rightarrow \infty} \frac{1}{N} \sum_{n=1}^N e^{2\pi i kx_n} = 0. \]

From the above discussion, Weyl’s criterion immediately implies the equidistribution theorem.

(of equidistribution theorem) Suppose x is rational, i.e., x=p/q for some integers p and q. Without loss of generality, we assume q > 1. Then for all integer n, by Euclid’s algorithm, nx \Mod 1 is in \{0, 1/q, \ldots, (q-i)/q\}. So the sequence \{nx\}_{n=1}^{\infty} is not equidistributed, e.g., no multiples of x fall inside [1/4q, 1/2q).

Now suppose x is irrational. Since r:= e^{2\pi i kx}\neq 1, we have

    \[ \frac{1}{N} \sum_{n=1}^N e^{2\pi i kx_n} = \frac{1}{N} \frac{r^{N+1} - 1}{r - 1}, \]

which tends to 0 as N \rightarrow \infty.

Before proving Weyl’s criterion, it is instructive to state the notion of equidistribution analytically. For 0\leq a \leq b \leq 1, let \chi_{[a,b)}(x)=1 iff x \Mod 1 \in [a, b). Note that \chi_{[a,b)} is a periodic function on all of \mathbb{R}.

(Weyl’s criterion restated) Let \{x_n\}_{n=1}^{\infty} be a sequence in \mathbb{R}. The following two statements are equivalent:
1) For 0\leq a < b \leq 1,

    \[ \lim_{N\rightarrow \infty} \frac{1}{N} \sum_{n=1}^N \chi_{[a, b)}(x_n) = \int_0^1 \chi_{[a, b)}. \]

2) For all integers k\neq 0,

    \[ \lim_{N\rightarrow \infty} \frac{1}{N} \sum_{n=1}^N e^{2\pi i kx_n} = \int_0^1 e^{2\pi i k}. \]

Written in this form, Weyl’s criterion is asserting that if the sum of every interval function at a sequence converges to its integral, then every exponential sum converges to its integral as well, and vice versa. The reason is that both the space of interval functions and the space of exponential functions can approximate one another. More precisely, an exponential function is clearly periodic and continuous (and thus uniformly continuous), and every uniformly continuous function can be approximated by a finite sum of interval functions. For the other direction, a periodic, interval function can be approximated by a periodic, continuous function, which by Fejér’s Theorem can also be approximated by a finite sum of exponential functions. Now we have all the machinery to write down a proof.

(of Weyl’s criterion) For the forward direction (1) => (2), fix k\neq 0 and \epsilon > 0. Since the exponential function e_k(x):=e^{2\pi i kx} continuous on the closed interval [0, 1], it is also uniformly continuous, i.e., there exists some \delta > 0 such that for every x, x', |e_k(x) - e_k(x')| < \epsilon. Let f_S be a step function where f_S(x) = e_k(m \delta) if x \in [m\delta, (m+1)\delta). By construction, f_S is a finite sum of interval functions with \norm{f_S - e_k}_{\infty} < \epsilon.

By linearity, we have

    \[ \lim_{N\rightarrow \infty}\frac{1}{N} \sum_{n=1}^N f_S(x_n) = \int_0^1 f_S. \]

By taking N sufficiently large and repeatedly applying triangle inequality, we have

    \begin{eqnarray*} \abs{\frac{1}{N} \sum_{n=1}^N e_k(x_n) - \int_0^1 e_k} & \leq & \abs{\frac{1}{N} \sum_{n=1}^N e_k(x_n) - \frac{1}{N} \sum_{n=1}^N f_S(x_n)} + \\ & & \abs{\frac{1}{N} \sum_{n=1}^N f_S(x_n) - \int_0^1 e_k} \\ & \leq & \epsilon + \abs{\frac{1}{N} \sum_{n=1}^N f_S(x_n) - \int_0^1 f_S} + \abs{\int_0^1 f_S - \int_0^1 e_k} \\ & \leq & 3\epsilon, \end{eqnarray*}

implying that \frac{1}{N} \sum_{n=1}^N e_k(x_n) converges to \int_0^1 e_k as N\rightarrow \infty.

Now we consider the reverse direction (2) => (1). Fix 0\leq a < b \leq 1 and let \epsilon > 0. Define

    \[ f^+(x) =  \begin{cases} 1 & \text{if } x \in [a, b) \\ 1/\epsilon (x - (a - \epsilon)) & \text{if } x \in [a - \epsilon, a) \\ -1/\epsilon (x - (b - \epsilon)) + 1 & \text{if } x \in [b, b + \epsilon) \\ 0 & \text{everywhere else.} \end{cases} \]

Note that f^+ is continuous, periodic, and dominates \chi_{[a, b)}. By Fejér’s Theorem, there exists P(x) = \sum_{k=1}^d p_k e^{2\pi i kx} such that \norm{f^+ - P}_{\infty} < \epsilon. By linearity, we have

    \[ \lim_{N\rightarrow \infty}\frac{1}{N} \sum_{n=1}^N P(x_n) = \int_0^1 P. \]

By taking N sufficiently large and repeatedly applying triangle inequality, we have

    \begin{eqnarray*} \abs{\sum_{n=1}^N f^+(x_n) - \int_0^1 f^+} & \leq & \abs{\sum_{n=1}^N f^+(x_n) - \sum_{n=1}^N P(x_n)} + \abs{\sum_{n=1}^N P(x_n) - \int_0^1 f^+} \\ & \leq & \epsilon + \abs{\sum_{n=1}^N P(x_n) - \int_0^1 P} + \abs{\int_0^1 P - \int_0^1 f^+} \\ & \leq & 3\epsilon, \end{eqnarray*}

implying that

    \[ \lim_{N\rightarrow \infty}\frac{1}{N} \sum_{n=1}^N f^+(x_n) = \int_0^1 f^+. \]

Since f^+ dominates \chi_{[a, b)}, we have

    \begin{eqnarray*} \limsup_{N \rightarrow \infty}  \frac{1}{N} \sum_{n=1}^N \chi_{[a,b)}(x_n) & \leq & \limsup_{N \rightarrow \infty}  \frac{1}{N} \sum_{n=1}^N f^+(x_n) \\ & = & \int_0^1 f^+ \\ & = & \int_0^1 \chi_{[a, b)} + \epsilon. \end{eqnarray*}

One can similarly define a periodic, continuous function f^- that approximates \chi_{[a, b)} from below, showing that \liminf_{N \rightarrow \infty}  \frac{1}{N} \sum_{n=1}^N \chi_{[a,b)}(x_n) is bounded below by \int_0^1 \chi_{[a, b)} - \epsilon, which together with the proceeding argument will show that the limit of \frac{1}{N} \sum_{n=1}^N \chi_{[a,b)}(x_n) must exist and converge to \int_0^1 \chi_{[a, b)}.

Summability, Tauberian theorems, and Fourier series

Consider a periodic function f. Its Fourier series \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{2\pi inx} always exists formally, but questions of whether its Fourier series converges to itself can be rather subtle. In general, f may be different from its Fourier series: a Fourier coefficient is the evaluation of an integral, so f can be changed pointwise without affecting the value of any Fourier coefficient. It may be natural to believe that convergence holds when f is continuous. However, an example constructed by du Bois-Reymond showed that continuity is not sufficient. The most general result in this area is Carleson’s theorem, showing that L^2-functions have convergent Fourier series almost everywhere.

Here we will prove that the Fourier series of f converges to f in the sense of Cesàro (or Abel), and we will highlight through a Tauberian-type theorem that convergence holds when f is “smooth”. In particular, the main theorem we will show is that for a periodic, continuous function, its Fourier series converges pointwise if its Fourier coefficients are decaying at a linear rate.

If f:[-\pi, \pi] \rightarrow \mathbb{C} and \abs{\hat{f}(n)}=o(1/n) and f is continuous at x, then \sum_{|n| \leq N} \hat{f}(n)e^{2\pi inx} \rightarrow f(x) as N\rightarrow \infty.
The sufficient condition o(1/n) in the above theorem can be improved to O(1/n), but we won’t prove the stronger statement here.

What types of functions are characterized by decaying Fourier coefficients?
A simple calculation with integration by parts shows that if f is differentiable, then \abs{n\hat{f}(n)} = \abs{\hat{f'}(n)}. Since \abs{\hat{f'}(n)}=o(1) by the Riemann-Lesbesgue lemma, we have the following:

If f:[-\pi, \pi] \rightarrow \mathbb{C} is differentiable, then its Fourier series converges pointwise to f.

In general, convergence also holds for other types of smoothness conditions.

Summability

We first take a detour (without the Fourier analysis setting) to investigate the different notions of summability for a sequence. The standard and familiar notion of summability captures the intuitive notion that as one adds more numbers from a sequence, the sum gets closer to its limit. More precisely,

A sequence \{a_n\}, a_n \in \mathbb{C} is summable to s if its sequence of partial sums \{s_n\} where s_n = \sum_{i=1}^n a_i converges to s.

Under this definition, the partial sums of Grandi’s series \{1, -1, 1, -1, \ldots \} oscillate between 1 and 0, and thus the series diverges. However, one could argue that this series “hovers” around 1/2 since the partial sums oscillate between 1 and 0. In fact, the average partial sums of the sequence \{1, 0, 1, 0, \ldots \} converge to 1/2. This motivates the following alternate definition of summability:

A sequence \{a_n\}, a_n \in \mathbb{C} is Cesàro-summable to s if its sequence of Cesàro means \{\sigma_N\} converges to s, where \sigma_N = \frac{1}{N} \sum_{n=1}^N s_n and s_n is the n-th partial sum \sum_{i=1}^n a_i.

It is instructive to rewrite the Cesàro mean as

    \begin{eqnarray*} \sigma_N & = & \frac{1}{N} \sum_{n=1}^{^N} s_n \\  & = & \frac{1}{N} \sum_{n=1}^{N} \sum_{i=1}^{n} a_i \\  & = & \sum_{n=1}^{N}  \left(\frac{N - n + 1}{N}\right) a_n. \end{eqnarray*}

We can see that the Cesàro mean is a weighted average of the a_n‘s, where the later terms contribute with a discount linear factor. With this insight, it is intuitive to conclude if a sequence is summable, then it is also Cesàro-summable: a summable sequence must have vanishing “tail”, implying that its Cesàro means form a Cauchy sequence and thus converge.

We can consider another definition of summability where the discount factor is exponential:

A sequence \{a_n\}, a_n \in \mathbb{C} is Abel summable to s if for every r \in [0, 1), A(r):=\sum_{n=1}^{\infty} r^n a_n converges and \lim_{r \rightarrow 1} A(r) = s.

Intuitively, Abel summability ought to capture a larger class of sequences. In fact, one can summarize this entire section in the following:

If a sequence is summable to s, then it is Cesàro-summable to s. If a sequence is Cesàro-summable to s, then it is also Abel-summable to s.
Suppose first that \{a_n\} is summable to s. For every \epsilon > 0, there exists N_0 such that for every N\geq N_0, |s_n - s| \leq \epsilon / 2. Let D:= \sum_{i=1}^{N_0} |s_i - s| and N_1 := \max(N_0, 2D/\epsilon). Then for every N \geq N_0, we have

    \begin{eqnarray*} \abs{\sigma_N - s} & = & \abs{\frac{1}{N} \sum_{n=1}^N (s_n - s)} \\  & \leq & \frac{1}{N} \sum_{n=1}^N \abs{s_n - s} \\  & = & \frac{1}{N} \sum_{i=1}^{N_0} \abs{s_n - s} + \frac{1}{N} \sum_{n > N_0}^N \abs{s_n - s} \\  & \leq & \frac{1}{N} \sum_{n=1}^{N_0} \abs{s_n - s} + \frac{\epsilon}{2}, \end{eqnarray*}

and when N is sufficiently large, \sum_{n=1}^{N_0} \abs{s_n - s} \leq  \epsilon N/2,
implying that \sigma_n \rightarrow s as n \rightarrow \infty.

Now suppose that \{\sigma_n\} converges to s. Define s_0 = 0. Then we can write

    \begin{eqnarray*} A(r) & = & \sum_{n=1}^{\infty} a_n r^n \\  & = & \sum_{n=1}^{\infty} (s_n - s_{n-1}) r^n \\  & = & \sum_{n=1}^{\infty} s_n r^n - r \sum_{n=1}^{\infty} s_{n-1} r^{n-1} \\  & = & (1 - r) \sum_{n=1}^{\infty} s_n r^n. \end{eqnarray*}

Since s_n = n \sigma_n - (n-1) \sigma_{n-1}, using a similar calculation as above, we can write

    \[ A(r) = (1-r)^2 \sum_{n=1}^{\infty} n \sigma_n r^n. \]

Since \{\sigma_n\} converges (and is thus bounded), \sum_{n=1}^{\infty} n \sigma_n r^n converges for every r \in [0, 1) and so does A(r).

Now it remains to show that \lim_{r\rightarrow 1} A(r) = s. Let \epsilon > 0, then there exists N such that for each n \geq N, |\sigma_n - s| \leq \epsilon. We can split A(r) into two sums:

    \[ (1-r)^2 \sum_{n=1}^{N} n \sigma_n r^n + (1-r)^2 \sum_{n>N}^{\infty} n \sigma_n r^n. \]

Since one can exchange limit with finite sums, \lim_{r\rightarrow 1}(1-r)^2 \sum_{n=1}^{N} n \sigma_n r^n is 0. So

    \begin{eqnarray*} \lim_{r\rightarrow 1} A(r) & \leq & (s + \epsilon) \lim_{r\rightarrow 1} (1-r)^2 \sum_{n>N}^{\infty} n r^n \\  & = & (s + \epsilon) \lim_{r\rightarrow 1} (1-r)^2 r \sum_{n>N}^{\infty} n r^{n-1} \\  & = & (s + \epsilon) \lim_{r\rightarrow 1} (1-r)^2 r \left( \sum_{n>N}^{\infty} r^n \right)^' \\  & = & (s + \epsilon) \lim_{r\rightarrow 1} (1-r)^2 r \left( \frac{r^{N+1}}{1-r} \right)^' \\  & = & (s + \epsilon) \cdot 1. \end{eqnarray*}

A similar shows that \lim_{r\rightarrow 1} A(r) \geq s - \epsilon, and hence \{a_n\} is Abel summable to s.

A Tauberian-type theorem

In the previous section, we showed that the class of summable sequences is a subset of Cesàro-summable sequeences, which in turn is a subset of Abel-summable sequences. It is not difficult to see that these containments are strict. However, by imposing certain conditions, these containments can also be reversed, and statements of this type are called “Tauberian-type theorems”. Here we prove a simple version where the magnitude of the sequence decays linearly.

(Tauberian) Suppose the sequence \{a_n\} satisfies \abs{a_n} = o(1/n).
(a) If \{a_n\} is Cesàro-summable to s, then \{a_n\} is also summable to s.
(b) If \{a_n\} is Abel-summable to s, then \{a_n\} is also summable to s.
(a) Fix \eps > 0. Let \sigma_N, s_N be the Cesàro mean and partial sum of \{a_n\}, respectively. By assumption, \abs{\sigma_N -s} \leq \eps / 2 when N is sufficiently large, and by triangle inequality, \abs{s_N - s} is at most \abs{s_N - \sigma_N} + \eps / 2, so it suffices to prove that \abs{s_N - \sigma_N} \leq \eps / 2.

Recall that in the previous section, we proved that

    \[ \sigma_N = \sum_{n=1}^{N}  \left(\frac{N - n + 1}{N}\right) a_n. \]

Then we have

    \begin{eqnarray*} \abs{s_N - \sigma_N} & = & \abs{\sum_{n=1}^N a_n - \sum_{n=1}^N \frac{N - n + 1}{N} a_n} \\ & = & \sum_{n=1}^N \frac{n - 1}{N} \abs{a_n}. \end{eqnarray*}

By assumption, there is some N_0 such that for every n \geq N_0, (n-1)\abs{a_n} \leq \eps/4, implying \sum_{n=N_0}^N \frac{n - 1}{N} \abs{a_n} \leq \eps / 4. Then when N is sufficiently large, \sum_{n<N_0} \frac{n - 1}{N} \abs{a_n} \leq \eps / 4, proving that \sum_{n=1}^N \frac{n - 1}{N} \abs{a_n} \leq \eps / 2 as desired.

(b) Let A(r) = \sum_{n=1}^N r^n a_n. Define r_N = 1- 1/N. By assumption, when N is sufficiently large, \abs{A_N(r_N) - s} \leq \eps/2, and by triangle inequality, \abs{s_N - s} is at most \abs{s_N - A_N(r_N)} + \eps/2, so it suffices to prove that \abs{s_N - A_N(r_N)} \leq \eps/2. By triangle inequality again, we have

    \begin{eqnarray*} \abs{s_N - A_N(r_N)} & = & \abs{\sum_{n=1}^N a_n - r_N^n a_n} \\ & \leq & \sum_{n=1}^N \paren{1 - r_N^n} \abs{a_n}. \end{eqnarray*}

By assumption, there is some N_0 such that for every n \geq N_0, (n-1)\abs{a_n} \leq \eps/4. Again we can break the the above sum into two, and for the later terms, we have

    \begin{eqnarray*} \sum_{n=N_0}^N \paren{1 - r_N^n} \abs{a_n} & = & \sum_{n=N_0}^N \paren{1-r_N}\paren{1 + r_N + \ldots + r_N^{n-1}} \abs{a_n} \\ & \leq & \sum_{n=N_0}^N \frac{n|a_n|}{N} \\ & \leq & \eps/4. \end{eqnarray*}

For the initial terms, we have

    \begin{eqnarray*} \sum_{n<N_0} \paren{1 - r_N^n} \abs{a_n} & \leq & \paren{1-r_N^{N_0}} \sum_{n<N_0}\abs{a_n} \\ & = & \paren{1-\paren{1 - \frac{1}{N}}^{N_0}} \sum_{n<N_0}\abs{a_n} \\ & \leq & \frac{N_0}{N} \sum_{n<N_0}\abs{a_n}, \end{eqnarray*}

where the last inequality is due to Bernoulli’s inequality, and is bounded above by \eps/4 when N is sufficiently large. Together we have \abs{s_N - A_N(r_N)} \leq \eps/2 as desired.

Fourier series

We now come back to the Fourier setting. First recall that

For a periodic function f, we define its N-th partial Fourier sum to be S_N(f)(x):= \sum_{|n| \leq N -1} \hat{f}(n) e_n(x), with e_n(x):=e^{2\pi inx}, and \hat{f}(n)=\frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-inx} dx.

We now introduce Dirichlet and Fejér kernels, which when convolved with a function f, we obtain the partial sum and Cesàro mean of f, letting us invoke Tauberian Theorem.

D_N(x):= \sum_{|n| \leq N-1} e_n(x) is the N-th Dirichlet kernel, and F_N(x):= \frac{1}{N} \sum_{|n| \leq N-1} D_n(x) is the N-th Fejér kernel.
(Tauberian theorem restated) Let f be a periodic function with \abs{\hat{f}(n)}=o(1/n). Then for any x \in [-\pi, \pi], if (F_N*f)(x)\rightarrow s for some s as N\rightarrow \infty, then S_N(f)(x) \rightarrow s as N\rightarrow \infty.
For any nonnegative integer n, let a_n := \hat{f}(n) e_n(x) + \hat{f}(-n) e_n(-x). Note that by assumption a_n = o(1/n). By the Convolution Theorem, the N-th partial sum of \{a_n\} is simply

    \[ S_N(f)(x) = (D_N * f)(x), \]

and similarly, the N-th Cesàro mean of \{a_n\} is

    \begin{eqnarray*} \frac{1}{N} \sum_{n=0}^{N-1} \sum_{i=0}^n a_i & = & \frac{1}{N} \sum_{n=0}^{N-1} (D_n * f)(x) \\ & & = (F_N * f)(x), \end{eqnarray*}

where the second line follows from the linearity of convolution. Thus, by Tauberian theorem, if (F_N*f)(x) converges, then S_N(f)(x) converges to the same value as well.

To prove the main theorem, it now suffices to show that (F_N*f)(x) approaches f(x). We make one last definition:

Let K_N be a periodic function. We say that \{K_N\}_{n=0}^{\infty} is a family of good kernels if the following three conditions all hold:
(a) for each N\geq 0, \frac{1}{2\pi} \int_{-\pi}^{\pi} K_N = 1,
(b) for each N\geq 0, \int_{-\pi}^{\pi} |K_N| = O(1), and
(c) for any \delta > 0, \int_{\delta \leq |x| \leq \pi} |K_N| \rightarrow 0 as N \rightarrow \infty.

The idea is that a family of good kernels essentially approximate the Dirac delta function, which is a single point with infinite mass at 0 and vanishes everywhere else.

The Fejér kernels \{F_N\}_{N=0}^{\infty} is a family of good kernels.

We will skip the proof of this fact as it is a straightforward calculation once one establishes the trignometric identity F_N(x) = \frac{\sin^2 (Nx/2)}{N \sin^2 (x/2)}. (However, the Dirichlet kernels cannot be a family of good kernels, otherwise every Fourier series converges to its function, which is not true.) Intuitively, since F_N*f(x) is a weighted average of f with highest weights around x (by the definition of a good family of kernels), we can expect that this value is close to f(x), as long as the f does not change its value too much around x. Now we formalize and prove this intuition, which will finish the proof of our main theorem.

Suppose \{K_N\}_{N=0}^{\infty} is a family of good kernels. Let f be a periodic, integrable function that is continuous at x. Then (K_N * f)(x) \rightarrow f(x) as N\rightarrow \infty.
We first write

    \begin{eqnarray*} \abs{(K_N*f)(x) - f(x)} & = & \frac{1}{2\pi} \abs{\int_{-\pi}^{\pi} K_N(y) f(x-y)dy - f(x)} \\ & = & \frac{1}{2\pi} \abs{\int_{-\pi}^{\pi} K_N(y) f(x-y)dy - \int_{-\pi}^{\pi} K_N(y) f(x)dy} \\  & \leq & \frac{1}{2\pi} \int_{-\pi}^{\pi} \abs{K_N(y)} \abs{f(x-y) - f(x)} dy, \end{eqnarray*}

where the first through third lines follow from the definition of convolution, Condition (a) of a good family of kernels, and triangle inequality.

The idea is that when integrating over y, \abs{K_N(y)} is small if y is bounded away from 0, and \abs{f(x-y) - f(x)} is small if y is close to 0. Formally, let \eps > 0. Since f is continuous at x, there is some \delta > 0 such that

    \[ \int_{|y| \leq \delta} \abs{K_N(y)} \abs{f(x-y) - f(x)} dy \leq \eps M, \]

for some M by Condition (b) of a good family of kernels. Since \eps can be arbitrary, the above integral is 0.

Now for the other values of y, note that since f is integrable, the function is bounded by some B. Then

    \[ \int_{\delta \leq |y| \leq \pi} \abs{K_N(y)} \abs{f(x-y) - f(x)} dy \leq  2B \int_{\delta |y| \leq \pi} \abs{K_N(y)} dy, \]

which tends to 0 as N \rightarrow \infty, finishing the proof.

In the previous section, we showed that the Abel mean can play a similar role to Cesàro. Similarly, one can define the Poisson kernel P_r(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta} and show that it is a good family of kernels as r \rightarrow 1 to prove the main theorem through P_r instead.

The Basel problem via Fourier series

The Basel problem asks to compute the summation of the reciprocal of the natural numbers squared, and it is known that

    \[\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.\]

While there are numerous known proofs (our presentation resembles Proof 9 most closely), we give a self-contained one that motivates the usage of Fourier analysis for periodic functions.

Recall that a function f:\mathbb{R}\rightarrow\mathbb{C} is 2\pi-periodic if for every x \in \mathbb{R}, f(x+2\pi)=f(x). From Fourier analysis, every such function can be expressed as a series of cosines and sines. More precisely,

If f is continuous and 2\pi-periodic, then

    \[\left\Vert f - \sum_{n=-\infty}^{\infty} \hat{f}(n) e_n\right\Vert_2 = 0,\]

where e_n(x) = e^{inx}, and \hat{f}(n) is the n-th Fourier coefficient equal to \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-inx} dx.

We say that \sum_{n=-\infty}^{\infty} \hat{f}(n) e_n is the Fourier series of f. When we define the inner product of two functions to be \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x) \overline{g(x)} dx, then it is easily seen that the “characters” \{e_n\} form an orthonormal system, and the n-th Fourier coefficient can be viewed as the projection of f along the n-th character. The Fourier theorem asserts that f can be approximated in L_2-norm by its Fourier series. In fact, by applying the Weierstrass M-test, we can make the following statement with a stronger convergence guarantee:

If f is continuous and 2\pi-periodic and \sum_{n=-\infty}^{\infty} \left|\hat{f}(n)\right| is convergent, then its Fourier series \sum_{n=-\infty}^{\infty} \hat{f}(n) e_n converges uniformly to f.

Furthermore, the Fourier transform \hat{f} itself encodes many of the properties of the original function f. In particular, we note that the symmetry of f is preserved in its Fourier transform.

If f is even, then for every n \in \mathbb{Z}, \hat{f}(-n) = \hat{f}(n).
If f is odd, then for every n \in \mathbb{Z}, \hat{f}(-n) = -\hat{f}(n).

Now we have the machinery to compute \sum_{n=1}^{\infty} 1/n^2.

(Of Basel) Define f:[-\pi, \pi]\rightarrow \mathbb{R} to be f(x)=|x|. We compute the Fourier coefficient of f. First note that

    \[\hat{f}(0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} |x| dx = \frac{\pi}{2}.\]

For n \neq 0 \in \mathbb{Z}, we have

    \[2\pi\hat{f}(n) = \int_{0}^{-\pi}xe^{-inx} dx + \int_{0}^{\pi}xe^{-inx} dx. \]

Using integration by parts, we have for every a, b \in \mathbb{R},

    \begin{eqnarray*} \int_a^b xe^{-inx}dx & = & \frac{x e^{-inx}}{-in}\Biggr|_a^b - \int_a^b \frac{e^{-inx}}{-in}dx \\ & = & \frac{x e^{-inx}}{-in}\Biggr|_a^b + \frac{e^{-inx}}{n^2}\Biggr|_a^b. \end{eqnarray*}

Then

    \begin{eqnarray*} 2\pi\hat{f}(n) & = & \frac{-\pi(-1)^n}{-in} + \frac{(-1)^n -1 }{n^2} + \frac{\pi(-1)^n}{-in} + \frac{(-1)^n -1 }{n^2} \\ & = & \frac{2((-1)^n - 1)}{n^2}.  \end{eqnarray*}

Thus,

    \[ \hat{f}(n) =  \begin{cases} \frac{-2}{\pi n^2} & \text{if $n$ is odd,} \\ 0 & \text{if $n$ is even and nonzero,} \\ \frac{\pi}{2} & \text{if $n=0$.}  \]

From Euler’s formula, the Fourier series of f can be written as

    \[  \hat{f}(0) + \sum_{n=1}^{\infty} \left[ (\hat{f}(n) + \hat{f}(-n)) \cos(nx) + i(\hat{f}(n) - \hat{f}(-n)) \sin(nx) \right]. \]

Since f is even, by Proposition, \hat{f}(-n)=\hat{f}(n). From our calculation, the Fourier series of f is explicitly

    \[  \frac{\pi}{2} + \sum_{n\geq1, n \text{ odd}} \frac{-4}{\pi n^2} \cos(nx). \]

Note that \sum_{n=1}^{\infty} 1/n^2 is convergent from the Cauchy condensation test. Thus, by the Convergence Theorem, we conclude that

    \[ \frac{\pi}{2} + \sum_{n\geq1, n \text{ odd}} \frac{-4}{\pi n^2} \cos(nx) \longrightarrow |x|, \]

where the convergence is uniform (and in particular, pointwise). Setting x=0, we see that

    \[ \sum_{n\geq1, n \text{ odd}} \frac{1}{n^2} = \frac{\pi^2}{8}. \]

Lastly, observe that

    \begin{eqnarray*} \sum_{n=1}^{\infty} \frac{1}{n^2} & = & \sum_{n\geq1, n \text{ even}} \frac{1}{n^2} + \sum_{n\geq1, n \text{ odd}} \frac{1}{n^2} \\ & = & \frac{1}{4} \sum_{n=1}^{\infty} \frac{1}{n^2} + \frac{\pi^2}{8}, \end{eqnarray*}

finishing the proof.

Auction algorithm for the assignment problem

The maximum bipartite matching is a well-known problem in graph theory, operations research, and economics. Also known by the name assignment problem, it models a marketplace with buyers and items, where every buyer has a valuation for each item, and we want to match (assign) each buyer to an item, with no item being shared. For social utility, an assignment that maximizes the total valuation is ideal.

In graph theory, the Hungarian algorithm by Kuhn produces a matching in polynomial time maximizing the total weight of the edges. Independently, Bertsekas from operations research and Demange, Gale, and Sotomayor from the economics perspective both use an underlying auction to solve the same problem. The auction provides additional intuition for this problem not transparent when using only graph-theoretic techniques. Here we describe a simple auction algorithm, largely following the presentation from this blog post.

To formulate this problem, we consider a bipartite graph B \times I, where B represents buyers and I represents items, with each edge (i,j) having an nonnegative weight v_{ij} representing the value Buyer i places on Item j.

A matching from B to I is a function M:B \rightarrow I such that for all i\neq j\in B, M(i)\neq M(j) or M(i)=M(j)=\emptyset.

The assignment problem is simply to find a matching function M such that its valuation \sum_{i\in B} v_{i,M(i)} is maximized. (Since it is possible that some buyers may not be assigned an item, we write v_{i,\emptyset}=0 for notational convenience.)

Below is the pseudocode of the auction algorithm. Intuitively, the algorithm is simulating a multi-item auction, where we put a price tag initialized to zero on each item. The queue Q represents the list of unmatched buyers, which is initialized to B. In each iteration, a buyer is greedily matched with his most valued item minus the price tag. However, each time an item is matched, it becomes slightly more expensive, becoming less valuable to other buyers in future rounds.

AUCTION(epsilon):
  Queue Q <- B 
  for j in I:
    p[j] <- 0
  
  BidirectionalMap M
  while !Q.isEmpty():
    i <- Q.pop()
    j <- Find j in I maximizing v[i][j] - p[j]

    if v[i][j]-p[j] >= 0:
      if M.ContainsValue(j):
        Q.push(M.KeyOf(j))
      M(i) <- j
      p[j] <- p[j] + epsilon

  Output M, p
Consider the case when |I|=1, where a group of buyers is being matched for a single item. Clearly whoever values the item the most will win the auction, but at what price? From the highlighted line in the above pseudocode, it is readily seen that a buyer drops out when the price is above his value. Thus, the final sale price is at most the second highest valuation among all buyers (plus O(\epsilon)) . Hence, the above algorithm can be seen as a generalization of a second-price auction for multiple items.

Complexity. Before analyzing the running time, we first make the observation that the algorithm must terminate in finite steps. The key insight is that a buyer will not overpay for an item. In particular, writing C = \max_{i,j} v_{i,j}, we have

AUCTION(\epsilon) terminates within (C/\epsilon + 1) |I| iterations.

To see this, fix an item j. Note that before each increment to p_j, the inequality p_j \leq v_{ij} \leq C must hold. So the final price must satisfy p_j \leq C + \epsilon. The number of iterations in the loop is at most the number of times a price is incremented, which is (C/\epsilon + 1) |I|.

A trivial implementation that finds the item with maximum value takes linear time, leading to an implementation with a running time of O(C|I|^2/\epsilon). A faster algorithm can be achieved by maintaining |B| priority queues of size |I|, one for each buyer, as follows: Each time the price of Item j is incremented, d_j priority queues are updated, where d_j is the number of buyers who are connected to Item j in the bipartite graph. Then the total number of priority queue updates is

    \[ \sum_{j \in I} d_j \mbox{(\# of increments to $p_j$)} \leq \sum_{j \in I} d_j (C / \epsilon + 1). \]

By the handshaking lemma, \sum_{j \in I} d_j = 2m, where m is the number of edges in the bipartite graph. Thus, a priority queue-based implementation takes time at most O(Cm\log|I|/\epsilon).

Now we analyze the valuation produced from the matching M. The proof is short and self-contained but utilizes \epsilon-complementary slackness, a concept related to primal-dual method in optimization.

Let M be the matching produced by AUCTION(\epsilon) and M' be any matching from B to I. Then

    \[\sum_{i\in B} v_{iM(i)} \geq \sum_{i\in B} v_{iM'(i)} - \epsilon |B|. \]

Since we can choose \epsilon to be arbitrarily small, the theorem implies that output matching is arbitrarily close to optimal. In particular, if all the values v_{ij} are binary, setting \epsilon < 1/|B| implies that the algorithm terminates with a maximal matching, and its time complexity is cubic on dense graphs, on par with the Hungarian algorithm.

We say that Buyer i is \epsilon-happy under M if he is matched with an item that is \epsilon-close to optimal, i.e., for all j\in I, v_{iM(i)} - p_{M(i)} \geq v_{ij} - p_j - \epsilon. For notational convenience, we will write v_{i\emptyset}=p_{\emptyset}=0, so a buyer being completely out-priced (being matched with an “empty” item) is also \epsilon-happy.

The key insight is that during each iteration, the algorithm preserves the loop invariant that all buyers in B \setminus Q are \epsilon-happy. To see this, note that the invariant holds vacuously true initially. During each iteration, whenever a buyer is removed from the queue, he is always matched with an item whose price (after the update) is \epsilon-close to being optimal. Then the previous holder of the item is added back to the queue. Since all buyers in B are \epsilon-happy when the algorithm terminates, we can observe that for any matching M',

    \[ \sum_{i\in B} \left(v_{iM(i)} - p_{M(i)}\right) \geq \sum_{i\in B} \left(v_{iM'(i)} - p_{M'(i)} - \epsilon\right). \]

Re-arranging, it is easy to see that

    \[ \sum_{i\in B} v_{iM(i)} \geq \sum_{i\in B} v_{iM'(i)} - \epsilon|B| + \sum_{i\in B}(p_{M(i)} - p_{M'(i)}). \]

Thus, it suffices to show that \sum_{i\in B}(p_{M(i)} - p_{M'(i)}) \geq 0. If Item j is in the range of M' but not M, then the algorithm must have never incremented its price, so p_j=0. Thus,

    \[ \sum_{i\in B}(p_{M(i)} - p_{M'(i)}) = \sum_{j \in Range(M)\setminus Range(M')} p_j, \]

which is nonnegative.

The theorem also implies that in any marketplace, there is a set of market-clearing prices, i.e., each buyer has at most one preferred item. If all the valuation are integers, then a variant of the above algorithm from the DGS paper produces a price vector that is pointwise minimal among all market-clearing prices.

Data analysis workflow

Motivation

In data analysis, a REPL (read-evaluate-print-loop) is great for asking questions quickly about a dataset. As opposed to writing code using a compiler, an interactive environment lets us hone in on the right questions, which is of great importance for exploratory work.

A typical REPL session — e.g., using R, Pig, Hive, Python — involves loading the dataset and performing a combination of filtering, projection, and transformation. Finding the appropriate combination may require many iterations, such as tweaking the various parameters used in filtering. The process of repeating almost the same commands can be quite tedious, especially dealing with wraparound lines.

Some potential workflows are:
1) Write script in a text editor. Execute it on the command line.
During development, lines are typically appended to an existing script. So each script execution has a significant overlap with previous executions. This can be a bottleneck when loading the input or a particular transformation takes time to process. Executing a script also makes it harder to try out various transformation on the fly.

2) Write script in a text editor. Then copy and paste commands as needed into a REPL.
While this allows many iteration, the physical act of copy-and-paste becomes the most dominant part of script writing, overshadowing the thought process. Furthermore, copy-and-paste from one window to another is somewhat challenging in a ssh session when working remotely.

3) Write script in a text editor. Execute the current line of the cursor in a REPL in a separate window.
This workflow allows the flexibility of experimental work. Certain IDEs, such as RStudio, already has this functionality. However, if one is looking for a uniform experience across various languages, a simple shell experience can be achieved with the vim+tmux+slime combination, described next.

vim + tmux + slime

The basic idea is to split a shell into two panes divided by a horizontal line, where vim is in the top pane and a REPL session at the bottom. A keyboard shortcut then sends the current line under the cursor in vim to the REPL. (Selecting a block of lines to be evaluated is also an option.)

1) tmux
tmux has many features. Here we just focus on our particular use case. Once installed, create a new session and split the current window into two panes:

tmux new -s session-name
tmux split-window

2) vim-slime
Originally an Emacs plugin, vim-slime is the tool that sends a block of lines in vim directly to another tmux pane. Once installed, adding these lines to .vimrc enables this powerful trick:

let mapleader="\"
map t :SlimuxREPLSendLine
vmap t :SlimuxREPLSendSelection

In this example, in vim’s normal mode, the keystroke “space” followed by “t” sends either the current line or the highlighted block of lines, if it exists, to another pane. (For the first usage, there will be a prompt that asks which pane should receive the signal from vim.)

Lastly, the vim-tmux-navigator plugin allows seamless navigation among vim split windows and tmux panes.

Here is an animated screenshot that sends three lines in vim to be evaluated in R:

data-analysis-workflow

SVD and low-rank approximation

The singular value decomposition (SVD) factorizes a matrix A=U\Sigma V^* where U,V are unitary and \Sigma is diagonal with nonnegative entries. Unlike the eigenvalue decomposition, every (real or complex) matrix is guaranteed to have a SVD. Geometrically, the SVD implies that every linear transformation is equivalent to a rotation, a scaling, then another rotation.

In data analysis, the SVD can be computed to perform principal component analysis (PCA), where an ellipsoid is used to “fit” a data matrix. In other words, once the SVD is computed, a partial SVD, A_k=\sum_{i=1}^{k} \sigma_i u_i v_i^* can be used to approximate the matrix, where u_i and v_i are the left and right singular vectors, and \sigma_i are the singular values in decreasing order. Equivalently, a partial SVD is obtained by replacing the smaller singular values with zero in the factorization. Intuitively, with the length of each ellipsoid’s component equal to its corresponding singular value, keeping only the components with non-trivial length makes the ellipsoid a good fit to the data. Formally, the Eckart-Young-Mirsky Theorem states that a partial SVD provides the best approximation to A among all low-rank matrices.

Let A and B be any m\times n matrices, with B having rank at most k. Then
\| A-A_k \|_2 \leq \| A-B \|_2,
\| A-A_k \|_F \leq \| A-B \|_F.

The theorem has a short proof for either norm, but here we describe a slightly longer proof using spectral method. Before proceeding, since we will work with singular values of different matrices, we will write \sigma_i(\cdot) to denote the i-th singular value of a matrix. We first state the following lemma, which provides an estimate on how much a low-rank perturbation to a matrix can affect its singular values.

Let A and B be any m\times n matrices, with B having rank at most k. \sigma_{i+k}(A) \leq \sigma_i(A-B).

The theorem is then an easy application of the above lemma.

(Of theorem) Recall that the \ell_2 norm of a matrix is equal to its largest singular value. So

    \[ \| A - A_k \|_2 = \sigma_1(A-A_k) = \sigma_{k+1}(A),\]

which by the lemma (setting i=1) is at most \sigma_1(A-B)=\| A-B \|_2.

For Frobenius norm, recall that the square of the Frobenius norm of a matrix is equal to the sum of the squares of its singular values. Applying the lemma above, we have

    \begin{eqnarray*} \| A-A_k \|_F^2 & = & \sum_{i=1}^{n} \sigma_i(A-A_k)^2 \\ & = & \sum_{i=1}^{n-k} \sigma_{i+k}(A)^2 \\ & \leq & \sum_{i=1}^{n}(A-B)^2 \\ & = & \| A - B \|_F^2, \end{eqnarray*}

finishing the proof.

The lemma is almost a re-statement of one of Weyl’s inequalities:

    \[ \sigma_{i+j-1}(X+Y) \leq \sigma_i(X) + \sigma_j(Y). \]

To see this, let X=A-B, Y=B, and j=k+1, and note that \sigma_{k+1}(B) is zero. Instead of invoking this inequality directly, we will prove the lemma from first principle.

(Of lemma) We first prove the case when i=1, i.e., \sigma_{k+1}(A) \leq \sigma_1(A-B).
Let V be the vector space spanned by the top k+1 right singular vectors v_1,\ldots,v_{k+1}, and W be the null space of B. By dimension argument, there must exist a non-zero vector w in V \cap W. Then bounding from above, we have

    \[ \|Aw\|_2 = \|(A-B)w\|_2 \leq \sigma_1(A-B) \|w\|_2. \]

Bounding from below and using the facts that w is orthogonal to v_i when i > k+1, Pythagorean Theorem, decreasing magnitude of singular values, and w lies in the orthonormal set \{v_1,\ldots,v_k\}, respectively, we obtain

    \begin{eqnarray*} \norm{Aw}_2^2 & = & \norm{\sum_{i=1}^{k+1} \sigma_i u_i \langle v_i, w\rangle}_2^2 \\ & = & \sum_{i=1}^{k+1} \sigma_i^2 \langle v_i, w \rangle ^2 \\ & \geq & \sigma_{k+1}^2 \sum_{i=1}^{k+1} \langle v_i, w \rangle ^2 \\ & = & \sigma_{k+1}^2 \|w\|_2^2. \end{eqnarray*}

Combining both sides yields \sigma_{k+1}(A) \leq \sigma_1(A-B).

Now we prove the general case. Using the i=1 case, we know that for any matrix C with rank at most i+k-1,

    \[ \sigma_{i+k}(A) \leq \sigma_1(A-C). \]

Writing C=C'+C'', we can write A-C = (A - B - C') + (B - C''). Since \sigma_1(X+Y) \leq \sigma_1(X) + \sigma(Y) (expressing \sigma_1 as the inner product \langle v, (X+Y)u \rangle and applying Cauchy-Schwarz, where u and v are the top left and right singular vectors of X+Y, respectively) , we have that

    \[ \sigma_{i+k}(A) \leq \sigma_1(A-B-C') + \sigma_1(B-C''). \]

Since the above inequality holds for arbitrary C with rank at most i+k-1, we can choose C'=(A-B)_{i-1} and C''=B_k. Then we can conclude that

    \[ \sigma_{i+k}(A) \leq \sigma_{i}(A-B) + \sigma_{k+1}(B), \]

completing the proof since \sigma_{k+1}(B) is zero.