
Benford's Law and Ergodic Theory

Why does the digit 1 lead numbers 30% of the time in real life?

UT Math DRP Symposium, April 24, 2025
Textbook: Nillsen, Randomness and Recurrence in Dynamical Systems

Abstract: I presented a proof sketch of Benford's Law using dynamical systems: mapping leading digits to mantissas via $\log_{10}$, then applying Weyl equidistribution for irrational circle rotations (Kronecker systems). I also studied recurrence (Poincaré/Kac), normal numbers, and ergodicity as the shared framework behind “random-looking” digit behavior.

The Phenomenon

Benford's Law appears in tax returns, stock prices, physical constants, and population data: anywhere numbers span multiple orders of magnitude. It states that the probability that the leading digit is $d$ is $P(d) = \log_{10}\left(1 + \frac{1}{d}\right)$, giving 30.1% for 1 and just 4.6% for 9.
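To make the formula concrete, here is a minimal Python sketch that tabulates $P(d)$ for each leading digit. The names `benford_probability` and `benford_table` are illustrative choices, not something from the talk.

```python
import math

def benford_probability(d: int) -> float:
    """Return P(d) = log10(1 + 1/d) for a leading digit d in 1..9."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Table of predicted frequencies for every possible leading digit.
benford_table = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford_table.items():
    print(f"{d}: {p:.1%}")
```

The nine probabilities sum to 1 because the sum telescopes to $\log_{10}(10) - \log_{10}(1)$.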

Why It Happens

Benford's Law appears so universally because of how leading digits relate to logarithms. Consider the sequence $2^n$ as one example of many. Any positive real number can be written as $10^k \cdot m$, where $k$ is an integer and $m \in [1, 10)$ is the mantissa, the part that determines the leading digit. Since $2 = 10^{\log_{10}(2)}$, we have $2^n = 10^{n \log_{10}(2)}$.

Write $n \log_{10}(2) = k + \{n \log_{10}(2)\}$, where $k = \lfloor n \log_{10}(2) \rfloor$ and $\{x\} = x - \lfloor x \rfloor$ is the fractional part. Then $2^n = 10^k \cdot 10^{\{n \log_{10}(2)\}}$, so $m = 10^{\{n \log_{10}(2)\}}$. The leading digit is $d$ iff $m \in [d, d+1)$, i.e. $\{n \log_{10}(2)\} \in [\log_{10}(d), \log_{10}(d+1))$. For example, $n = 10$ gives $10 \log_{10}(2) \approx 3.0103$, so $k = 3$ and $m = 10^{0.0103} \approx 1.024$, matching $2^{10} = 1024$.

This sequence of fractional parts is the orbit of a Kronecker system: rotation on $[0,1)$ by the irrational angle $\log_{10}(2)$. Weyl's Equidistribution Theorem says every orbit of such a rotation is uniformly distributed, so it visits each interval in proportion to its length. Hence the limiting frequency is $P(\text{leading digit} = d) = \log_{10}(d+1) - \log_{10}(d) = \log_{10}\left(1 + \frac{1}{d}\right)$, which is exactly Benford's Law. The same argument applies to any sequence $a^n$ with $\log_{10}(a)$ irrational, which helps explain why Benford's Law appears for multiplicatively growing quantities in such diverse contexts.
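The equidistribution claim can be checked numerically. The sketch below counts leading digits of $2^1, \dots, 2^{3000}$ and compares them with the Benford prediction. It uses exact integer arithmetic rather than floating-point fractional parts, an implementation choice made here to avoid rounding trouble at digit boundaries; the function name and the cutoff of 3000 terms are illustrative assumptions.

```python
import math
from collections import Counter

def power_of_two_digit_frequencies(n_terms: int) -> dict[int, float]:
    """Frequency of each leading digit among 2^1, ..., 2^n_terms."""
    counts = Counter()
    power = 1
    for _ in range(n_terms):
        power *= 2  # exact big-integer arithmetic, no rounding
        counts[int(str(power)[0])] += 1
    return {d: counts[d] / n_terms for d in range(1, 10)}

observed = power_of_two_digit_frequencies(3000)
predicted = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.3f}, predicted {predicted[d]:.3f}")
```

A finite sample only approximates the limit, so small deviations from the predicted values are expected.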

Intuitively: going from leading digit 1 to 2 requires doubling, while 8 to 9 is only a 12.5% increase. In real life, multiplicative processes spend more time with smaller leading digits.

Scale Invariance

The Weyl approach shows why powers of 2 follow Benford's Law from a mathematical standpoint, but the concept of scale invariance provides another perspective: since units are arbitrary, a natural leading-digit law should not depend on whether we measure in miles or kilometers. Benford's Law is the unique scale-invariant leading-digit distribution. On a log scale, multiplying by a constant just adds a constant; invariance under these shifts corresponds to a uniform distribution of the fractional parts of $\log_{10}(x)$, which produces Benford's formula.
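A small simulation can illustrate scale invariance. The sketch below builds synthetic data whose $\log_{10}$ values are uniform over six orders of magnitude (an assumption chosen so that the fractional parts are uniform), then rescales it by the miles-to-kilometers factor. The data, seed, and helper names are all hypothetical.

```python
import math
import random
from collections import Counter

def leading_digit(x: float) -> int:
    """Leading decimal digit of a positive float."""
    mantissa = x / 10 ** math.floor(math.log10(x))
    # Clamp to guard against rounding at exact powers of ten.
    return min(9, max(1, int(mantissa)))

def digit_frequencies(values: list[float]) -> dict[int, float]:
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

rng = random.Random(0)  # fixed seed so runs are repeatable
# Synthetic distances: log10(x) is uniform on [0, 6).
miles = [10 ** rng.uniform(0, 6) for _ in range(100_000)]
kilometers = [1.609344 * x for x in miles]

freq_miles = digit_frequencies(miles)
freq_km = digit_frequencies(kilometers)
for d in range(1, 10):
    benford = math.log10(1 + 1 / d)
    print(f"{d}: miles {freq_miles[d]:.3f}, km {freq_km[d]:.3f}, Benford {benford:.3f}")
```

Both columns should sit close to the Benford values, since rescaling only shifts the logarithms by a constant.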

Beyond Benford's Law, I also studied foundational results in dynamical systems and ergodic theory.

Recurrence and Waiting Times

The Kronecker system above is one example of a dynamical system, a model of how states evolve over time. A natural question: how often do these systems revisit particular states? Poincaré's recurrence theorem: for a measure-preserving transformation on a finite-measure space (e.g., a length-preserving map on a bounded interval), almost every point in a set $U$ eventually returns to $U$. The strengthened version says almost every point is recurrent, returning arbitrarily close to its starting position infinitely often.

Kac's theorem quantifies the average return time: for an ergodic system, if $U$ is a region within a space $S$, the expected number of steps for a point of $U$ to return to $U$ is $\mu(S)/\mu(U)$, the ratio of total measure to the region's measure. If $U$ occupies 10% of the space, you return on average every 10 iterations. Smaller regions take longer to revisit.
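As a rough numerical check of the 10% example, the sketch below follows one orbit of the rotation by $\log_{10}(2)$ and records the gaps between successive visits to $U = [0, 0.1)$. Averaging gaps along a single orbit stands in for the expectation in Kac's theorem, which is an assumption justified here by equidistribution of the rotation; the function name and step count are illustrative.

```python
import math

def return_times(alpha: float, region_length: float, n_steps: int) -> list[int]:
    """Gaps between successive visits of n*alpha mod 1 to [0, region_length)."""
    gaps = []
    last_visit = None
    for n in range(n_steps):
        x = (n * alpha) % 1.0
        if x < region_length:
            if last_visit is not None:
                gaps.append(n - last_visit)
            last_visit = n
    return gaps

gaps = return_times(math.log10(2), 0.1, 100_000)
mean_return_time = sum(gaps) / len(gaps)
print(f"mean return time: {mean_return_time:.2f}")
```

Individual gaps vary, but their mean should be close to $\mu(S)/\mu(U) = 1/0.1 = 10$.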

Randomness and Normal Numbers

Benford's Law tells us first digits aren't uniformly distributed across datasets. But what about all digits within a single number? When do those look “random”? Informally, a number is normal (in a given base) if its digits look statistically uniform at every level: single digits, pairs, triples, and longer blocks all occur in the proportions you'd expect from random digits. Borel's Normal Numbers Theorem: almost all real numbers are normal. (This is a loose definition; the formal treatment involves measure theory and is worth exploring if you're curious.)

Important distinction: normality is about digit distribution within a single number's base-$b$ expansion, while Benford's Law is about leading digits across a dataset. A simple non-example: $2/3$ is not normal (its expansion is eventually periodic in any integer base). Take the binary representation: $2/3 = 0.\overline{10}_2$. 0s and 1s appear equally often, but at the level of length-2 blocks, only 10 and 01 appear; 00 and 11 never occur, whereas in a normal number you'd expect all pairs to occur with equal frequency $1/4$. Thus it looks “normal” for single digits but fails normality once you examine pairs. Both Benford and normality ask “what does randomness look like?” in different contexts.
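The $2/3$ example can be verified directly. This sketch generates binary digits with exact rational arithmetic and counts overlapping blocks; the helper names and the choice of 1000 digits are illustrative.

```python
from collections import Counter
from fractions import Fraction

def binary_digits(x: Fraction, n_digits: int) -> str:
    """First n_digits binary digits after the point of x in [0, 1)."""
    digits = []
    for _ in range(n_digits):
        x *= 2
        bit = int(x >= 1)  # next digit is the integer part after doubling
        digits.append(str(bit))
        x -= bit
    return "".join(digits)

def block_frequencies(digits: str, block_len: int) -> dict[str, float]:
    """Relative frequency of each overlapping block of the given length."""
    total = len(digits) - block_len + 1
    counts = Counter(digits[i:i + block_len] for i in range(total))
    return {block: count / total for block, count in counts.items()}

expansion = binary_digits(Fraction(2, 3), 1000)
single_freqs = block_frequencies(expansion, 1)
pair_freqs = block_frequencies(expansion, 2)

print(expansion[:12])
print(single_freqs)
print(pair_freqs)
```

Single digits split evenly, while the pair table contains only the blocks 10 and 01, each near 1/2 instead of the 1/4 a normal number would give.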

The Bigger Picture

These seemingly distinct results share a common foundation in ergodic theory. Informally, a system is ergodic if a typical trajectory, followed long enough, explores the entire space in proportion to its measure. Birkhoff's Ergodic Theorem formalizes this: in ergodic systems, time averages equal space averages for almost every starting point. Track one such orbit over time, and the fraction of time it spends in any region converges to that region's measure.
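To see "time averages equal space averages" numerically, the sketch below averages $f(x) = x^2$ along one orbit of an irrational rotation and compares the result with $\int_0^1 x^2\,dx = 1/3$. The rotation angle $\sqrt{2} \bmod 1$, the test function, and the starting point are all assumptions picked for illustration.

```python
import math

def time_average(f, alpha: float, x0: float, n_steps: int) -> float:
    """Average of f along the orbit x0, x0 + alpha, x0 + 2*alpha, ... mod 1."""
    total = 0.0
    x = x0
    for _ in range(n_steps):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / n_steps

alpha = math.sqrt(2) % 1.0  # irrational rotation angle
observed_average = time_average(lambda x: x * x, alpha, x0=0.25, n_steps=100_000)
space_average = 1 / 3  # integral of x^2 over [0, 1)

print(f"time average:  {observed_average:.4f}")
print(f"space average: {space_average:.4f}")
```

Replacing $f$ with the indicator of an interval recovers the statement about the fraction of time spent in a region.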

Borel's theorem, Weyl's theorem, and Benford's Law are all instances of this principle. The digit 1 leads about 30% of the time because the corresponding interval in $[0,1)$ has measure $\log_{10}(2) - \log_{10}(1) \approx 0.301$.

Topics Covered

  • Benford's Law and scale invariance
  • Kronecker Systems and Weyl's Equidistribution Theorem
  • Recurrence and return times (Poincaré, Kac)
  • Borel's Normal Numbers Theorem and digit “randomness”
  • Birkhoff's Ergodic Theorem and time vs. space averages

Weekly Notes