1. The World According to Friedrich Gauss
We seem heavily inclined to apply Gaussian (or normal) distributions when making sense of various phenomena around us. But Gaussian laws are not the only statistical models that exist, so why do we freely favour them?
One plausible reason is that they are intuitive and reassuring: normal distributions convey a sense of stability and predictability that helps us cope with future uncertainty. Let’s use a classic example to illustrate the idea.
Imagine we wish to examine the properties of a random variable, say the average height of individuals in a group. We observe the following:
None of the above applies to power laws, which makes phenomena governed by those laws hard to work with (like predicting their mean and variance), at least in the traditional ways.
Phenomena governed by normal distributions are safe. The future resembles the past and can be reasonably predicted from historical data (provided that the randomness-generating mechanisms are static or slow-moving). However, the phenomena of interest to us, such as the behaviour of social networks like markets, economies, or communities, are complex. The same applies to biological (and most natural) systems.
Many natural and social phenomena exhibit complex behaviour and are well described by power laws; therefore, understanding power laws is vital to making sense of and acting in those environments.
So, what are power laws and power law dynamics, and how do they differ from Gaussian ones?
2. Gaussian (Normal) Distributions
The Gaussian distribution, also known as the normal distribution, is a mathematical model used to describe data whose values cluster around a central mean, forming a symmetrical bell-shaped curve. In a Gaussian distribution, most data points are close to the mean, and the number of data points falls off quickly as you move away from the mean in either direction.
This distribution is commonly encountered in natural phenomena like the heights of people, scores on standardized tests, and measurement errors. Mathematically, it is represented as:
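For reference, the standard textbook form of the Gaussian density is given below. The symbols μ (mean) and σ (standard deviation) are notation chosen here for illustration.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```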
The Gaussian distribution has the familiar bell curve that we see below.
The figure above shows a normal distribution with zero mean and unit standard deviation. The following can be inferred:
3. Power Law Distribution
A power law distribution is a mathematical model describing phenomena where a few events have high frequencies while many events have low ones. In a power law distribution, the probability of an event is inversely proportional to a power of its magnitude.
Mathematically, a power law distribution can be represented as:
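One common way to write a power law, in terms of the tail probability P(X > k), is sketched below. The constant C, the exponent α, and the lower cutoff x_min are notation assumed here, not symbols taken from the original article.

```latex
P(X > k) = C\,k^{-\alpha}, \qquad k \ge x_{\min},\ \alpha > 0
```

Under this form the corresponding density decays as x^{-(α+1)}.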
Power laws have special properties, which we shall explore later. For now, let’s focus closely on how P(X > k) changes with k. Let’s assume that P(X > k) = 1/n. The following is true:
What is interesting to observe here is that, while for the Gaussian the ratio P(X > 2k) / P(X > k) decreases rapidly as k grows, for power laws it is constant! In our example above, this means that P(X > 2k) / P(X > k) = P(X > 4k) / P(X > 2k) = … = 0.5. In other words, the ratio depends only on the factor by which we scale k (here, 2) and not on k itself.
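The contrast can be checked numerically. The sketch below assumes a standard normal variable on one side and, on the other, a power law tail of the form P(X > k) = k^(-α) for k ≥ 1 with α = 1, which is the case that yields the 0.5 ratio in the example. The function names are illustrative only.

```python
import math

def gaussian_tail(k):
    """P(X > k) for a standard normal variable (zero mean, unit variance)."""
    return 0.5 * math.erfc(k / math.sqrt(2))

def power_law_tail(k, alpha=1.0):
    """P(X > k) = k**(-alpha) for k >= 1 (assumed tail form with cutoff at 1)."""
    return k ** (-alpha)

def doubling_ratio(tail, k):
    """Ratio P(X > 2k) / P(X > k) for a given tail function."""
    return tail(2 * k) / tail(k)

if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(k, doubling_ratio(gaussian_tail, k), doubling_ratio(power_law_tail, k))
```

For a general exponent α, the power law ratio works out to 2^(-α), so it remains independent of k even when it is not 0.5.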
Both of these properties make the implications of power laws fascinating, as we will endeavour to show in the next sections.
4. Examples of Power Laws
4.1 A Historical Overview
Today, power laws continue to be a subject of study and fascination in many fields. They provide insights into the underlying principles governing the distribution of phenomena in complex systems, and their applications range from linguistics and economics to physics and network theory. The study of power laws has also expanded to address more complex distributions beyond simple scaling laws, such as fat-tailed distributions and multifractals.
4.2 Growth of Cities (Urbanization)
Power laws provide a framework for understanding the population distribution in cities. One of the most famous applications of power laws in this context is Zipf’s law.
Zipf’s law states that, in many countries, a city’s population is inversely proportional to its rank in the population hierarchy. In simpler terms, the second-largest city will have approximately half the population of the largest city, the third-largest city will have approximately one-third, and so on.
Mathematically, Zipf’s law can be expressed as:
Zipf’s law suggests that a few cities will have the highest populations while most cities will have smaller populations. This concept helps explain the unequal distribution of urban populations and has implications for urban planning, resource allocation, and infrastructure development.
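The rank-size relationship described above can be sketched in a few lines: the population at rank r is the largest city's population divided by r. The function name, the optional exponent parameter, and the 9 million figure are assumptions made for illustration, not data from any real country.

```python
def zipf_population(largest, rank, exponent=1.0):
    """Population predicted at a given rank: largest / rank**exponent.

    exponent=1.0 is the classic Zipf case described in the text.
    """
    if rank < 1:
        raise ValueError("rank starts at 1 for the largest city")
    return largest / rank ** exponent

if __name__ == "__main__":
    largest = 9_000_000  # hypothetical population of the largest city
    for rank in range(1, 6):
        print(rank, round(zipf_population(largest, rank)))
```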
4.3 Sandpile Crashes (Self-Organized Criticality)
Sandpile crashes are a classic illustration of self-organized criticality, which is observed in systems where particles are added one by one until the system reaches a critical state in which a single addition can trigger a cascade of events, often called an avalanche. This behaviour is a manifestation of power law dynamics.
In sandpile models, grains of sand are added to a pile, and when the pile reaches a certain height or angle, it becomes unstable and collapses. Interestingly, the size and frequency of these collapses follow a power law distribution. This means that small collapses occur frequently, medium-sized ones less often, and very large collapses (rare but significant) also occur.
Mathematically, the distribution of avalanche sizes can be described by a power law:
Sandpile models illustrate how complex systems can exhibit self-organized criticality, leading to power law behaviour.
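A minimal simulation makes the mechanism concrete. The text does not name a specific model, so the sketch below assumes the Bak-Tang-Wiesenfeld sandpile on a square grid: a cell holding four or more grains topples and sends one grain to each neighbour, and grains pushed over the edge are lost. The grid size, grain count, and function names are illustrative choices.

```python
import random
from collections import Counter

def drop_grain(grid, row, col, threshold=4):
    """Add one grain at (row, col), relax the pile, return the number of topplings."""
    size = len(grid)
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < threshold:
            continue  # already relaxed by an earlier toppling
        grid[r][c] -= threshold
        topplings += 1
        if grid[r][c] >= threshold:
            unstable.append((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:  # grains leaving the grid are lost
                grid[nr][nc] += 1
                if grid[nr][nc] >= threshold:
                    unstable.append((nr, nc))
    return topplings

def simulate(size=20, grains=20000, seed=0):
    """Drop grains at random cells and record the avalanche size for each grain."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    sizes = [drop_grain(grid, rng.randrange(size), rng.randrange(size))
             for _ in range(grains)]
    return grid, sizes

if __name__ == "__main__":
    _, sizes = simulate()
    counts = Counter(s for s in sizes if s > 0)
    for s in (1, 2, 4, 8, 16, 32, 64):
        print(s, counts.get(s, 0))
```

Plotting the avalanche counts against avalanche size on log-log axes is the usual way to look for the approximately straight line that signals a power law.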
5. Scale Invariance and Power Laws
5.1 The Pareto Principle Revisited
Power laws are a manifestation of scale invariance, which means that a system’s behaviour remains self-similar regardless of the scale at which you observe it. This property is often found in natural systems owing to the hierarchical and self-organizing nature of many processes.
To understand scale invariance in power laws, consider the diagram below. What it’s telling us is that if we take any partition at any level and closely inspect its properties, we would find that it has the same structure as its parents and children (down to the smallest scale).
For example, Vilfredo Pareto found that the distribution of land among the people of Italy followed a simple law (later known as the Pareto Principle): 80% of the land was owned by 20% of the people. Scale invariance tells us that, within that top 20%, 80% of the group’s land would in turn be owned by 20% of its members.
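The arithmetic behind this nesting is easy to spell out. The sketch below assumes the 80/20 split holds exactly at every level, which is an idealization; the function and parameter names are illustrative.

```python
def nested_pareto(levels, top_fraction=0.2, owned_fraction=0.8):
    """Return (share of people, share of land) after applying the rule 1..levels times."""
    return [(top_fraction ** n, owned_fraction ** n) for n in range(1, levels + 1)]

if __name__ == "__main__":
    for people, land in nested_pareto(3):
        print(f"top {people:.1%} of people own {land:.1%} of the land")
```

Applying the rule twice gives the top 4% owning 64% of the land, and three times gives the top 0.8% owning 51.2%.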
5.3 Why Big Cities Get Bigger — Explaining Power Law Dynamics
Why do big cities get bigger, rich people richer, and popular books more popular? Let’s try to answer that with an example.
Power law dynamics explain the accumulation of power, wealth, and status in societies, economies, and ecologies. They also explain the 80/20 rule.
The 80/20 rule (sometimes also called the “Pareto Principle” or “Law of the Vital Few”) is quite famous, stating that a minority of employees generate most of the business value, a minority of people control the majority of wealth, few large cities exist alongside many small ones, and the list goes on.
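One simple mechanism that produces this kind of concentration is a "rich get richer" process, in which newcomers join an existing city with probability proportional to its current size. The sketch below is an assumption made for illustration (a variant of what is often called preferential attachment), not a model taken from the original article; the parameter values and names are hypothetical.

```python
import random

def grow_cities(newcomers=20000, new_city_prob=0.05, seed=1):
    """Simulate city growth where bigger cities attract proportionally more newcomers.

    Returns the list of city sizes, largest first.
    """
    rng = random.Random(seed)
    sizes = [1]      # one founding city with a single resident
    residents = [0]  # residents[i] is the index of the city where resident i lives
    for _ in range(newcomers):
        if rng.random() < new_city_prob:
            # Occasionally a newcomer founds a brand-new city.
            sizes.append(1)
            residents.append(len(sizes) - 1)
        else:
            # Picking a random resident selects a city with probability
            # proportional to its current population.
            city = rng.choice(residents)
            sizes[city] += 1
            residents.append(city)
    return sorted(sizes, reverse=True)

if __name__ == "__main__":
    sizes = grow_cities()
    print("cities:", len(sizes))
    print("five largest:", sizes[:5])
    print("median size:", sizes[len(sizes) // 2])
```

The design choice worth noting is the `residents` list: sampling a resident uniformly is a cheap way to sample a city in proportion to its size, without recomputing weights at every step.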
6. Final Words
Statistics is mainly concerned with understanding the properties of a population from a random sample drawn from that population. This includes estimating the mean, variance, and higher moments and fitting a distribution law to the sample data. These properties allow us to understand the population and estimate event probabilities.
Below is a list of where things can go wrong with power laws.
Deriving statistical inferences from power law distributions can be more challenging than doing so from Gaussian (normal) distributions for several reasons:
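One such difficulty, the outsized influence of a single extreme observation on sample statistics, can be illustrated numerically. The sketch below compares the share of the sample total contributed by the largest observation for Gaussian data (heights, with an assumed mean of 170 and standard deviation of 10) and for heavy-tailed data drawn with Python's `random.paretovariate` (with an assumed shape parameter of 1.1). All numbers are illustrative.

```python
import random

def largest_share(samples):
    """Fraction of the sample total contributed by the single largest value."""
    return max(samples) / sum(samples)

def compare(n=10000, seed=42):
    """Return (Gaussian share, Pareto share) for n draws from each distribution."""
    rng = random.Random(seed)
    heights = [rng.gauss(170, 10) for _ in range(n)]     # thin-tailed data
    wealth = [rng.paretovariate(1.1) for _ in range(n)]  # heavy-tailed data
    return largest_share(heights), largest_share(wealth)

if __name__ == "__main__":
    gaussian_share, pareto_share = compare()
    print(f"largest height as share of total: {gaussian_share:.5f}")
    print(f"largest Pareto draw as share of total: {pareto_share:.5f}")
```

In the Gaussian case no single observation moves the sample mean noticeably, whereas in the heavy-tailed case one draw can account for a sizeable part of the total, which is why the sample mean and variance stabilize slowly, if at all.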
- Nassim Taleb, Fooled by Randomness
- Nassim Taleb, The Black Swan
- Geoffrey West, Scale: The Universal Laws of Growth, Innovation, Sustainability, and the Pace of Life in Organisms, Cities, Economies, and Companies
- Roger Lewin, Complexity: Life at the Edge of Chaos
- Ilya Prigogine, Isabelle Stengers, Order out of Chaos: Man’s New Dialogue with Nature