Imagine if we sample numbers from the unit interval and count the occurrences of each digit in every number’s infinite decimal representation. For instance, in the number , every non-zero digit appears exactly once, but appears infinitely often (trailing zeros). On the other hand, for the number , the digit occurrences (after the “dot”) is equal for every digit. As the number of samples increases, one may expect that the number of occurrences for each digit in to be close to one another. In fact, the Normal Number Theorem, first formulated by Borel, not only formalizes this intuition but goes a step further: for almost every number in , each digit appears equally likely. Below we discuss this theorem in detail.

We first recall a few basic facts about the representation of a number. Rather than working entirely in base , we work with an arbitrary base .

In other words, is the integer component of and the fractional component. With , our familiar decimal representation, we know that every real number has a decimal representation, and by identifying as , the representation is unique. Analogously, for every , every real number has a base- representation as well.

Note that in the definition above, , the digits in the integer component of , are completely ignored. The reason is that for every , its integer component is bounded, so in the above definition is fixed and does not affect the limiting value of digit frequencies.

We now state the Normal Number Theorem formally:

Let us first consider a truncated version of the theorem. Consider tossing a fair coin independently times. From a combinatorial argument, most sequences of coin tosses have about heads and tails. By interpreting a head as and a tail as , most -bit numbers are normal. The passage from bits to infinite bits is the crux of the theorem, since the combinatorial process of selecting a random -bit number no longer extends. The proof illustrates how a measure-theoretic abstraction can circumvent this difficulty. In particular, we will apply the following “-trick” repeatedly.

Since was arbitrary, must equal .

First consider the interval , where . By the previous proposition, it suffices to show that the set of non-normal numbers in is null. Now fix . By the proposition again, it suffices to show that the set of numbers in that are not normal with respect to is null. Now fix . By the proposition again, it suffices to show that the set of numbers in that are not normal with respect to base and digit is null. By definition, for , is normal with respect to base and digit iff is normal with respect to base and digit . Thus, without loss of generality, we may further assume that .

Now consider the probability space (, where is the set of Borel sets on , and the Lebesgue (probability) measure. For , define to be , where is the -th digit of the base expansion .

As defined, is an indicator function, and for , the set is just a union of intervals:

which is a Borel set, implying that is a Bernoulli random variable with .

To show independence, it suffices to show that for every , the family of random variables is independent. This follows directly from the observation that (since the interval has length ) and the previous observation that .

Given our setup, we now apply the Strong Law of Large Numbers to conclude: there exists a null set so that for every , converges to , i.e., is normal with respect to base and digit .

As a corollary, we conclude that for every continuous random variable, its output is most likely normal.

By the Dominated Convergence Theorem,

finishing the proof.