Power-law distributions seem to be everywhere, and not just in word-counts and whale whistles. Most people know that Vilfredo Pareto found them in the distribution of wealth, two or three decades before Udny Yule showed that stochastic processes like those in evolution lead to such distributions, and George Kingsley Zipf found his eponymous law in word frequencies. Since then, power-law distributions have been found all over the place — Wikipedia lists
… the sizes of craters on the moon and of solar flares, the foraging pattern of various species, the sizes of activity patterns of neuronal populations, the frequencies of words in most languages, frequencies of family names, the species richness in clades of organisms, the sizes of power outages, criminal charges per convict, volcanic eruptions, human judgements of stimulus intensity …
My personal favorite is the noise something makes when you crumple it up, as discussed by Eric Kramer and Alexander Lobkovsky, "Universal Power Law in the Noise from a Crumpled Elastic Sheet" (1995), referenced in "Zipf and the general theory of wrinkling", 11/15/2003.
Contradicting the Central Limit Theorem's implications for what is "normal", power law distributions seem to be everywhere you look.
Or maybe not?
Many of the alleged "power-law" examples are actually log-normal, or some other heavy-tailed distribution, according to a paper by Aaron Clauset, Cosma Rohilla Shalizi, and M. E. J. Newman, "Power-law distributions in empirical data" (SIAM Review 2009). As an alternative to the paper, you can read Cosma's blog post "So You Think You Have a Power Law — Well Isn't That Special?", 6/15/2007; or this summary of the results in "Cozy Catastrophes", 2/15/2012:
In our paper, we looked at 24 quantities which people claimed showed power law distributions. Of these, there were seven cases where we could flat-out reject a power law, without even having to consider an alternative, because the departures of the actual distribution from even the best-fitting power law were much too large to be explained away as fluctuations. (One of the wonderful things about a stochastic model is that it tells you how big its own errors should be.) In contrast, there was only one data set where we could rule out the log-normal distribution. […]
We found exactly one case where the statistical evidence for the power-law was "good", meaning that "the power law is a good fit and that none of the alternatives considered is plausible", which was Zipf's law of word frequency distributions. We were of course aware that when people claim there are power laws, they usually only mean that the tail follows a power law. This is why all these comparisons were about how well the different distributions fit the tail, excluding the body of the data. We even selected where "the tail" begins to maximize the fit to a power law for each case. Even so, there was just this one case where the data compellingly support a power law tail.
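The core of the fitting procedure is simple to sketch. Here is a minimal illustration of the maximum-likelihood exponent estimate for a continuous power-law tail (the estimator given in Clauset, Shalizi & Newman 2009), assuming the tail cutoff xmin is known; the full method also estimates xmin from the data and runs goodness-of-fit and likelihood-ratio tests, which this sketch omits:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw samples from a continuous power law p(x) ~ x^(-alpha) for x >= xmin,
# via inverse-transform sampling.
alpha_true, xmin, n = 2.5, 1.0, 50_000
u = rng.random(n)
x = xmin * (1.0 - u) ** (-1.0 / (alpha_true - 1.0))

# Maximum-likelihood estimate of the exponent, with xmin taken as given.
# (Clauset et al. instead choose xmin to minimize the KS distance to the fit.)
alpha_hat = 1.0 + n / np.log(x / xmin).sum()
print(round(alpha_hat, 2))  # close to alpha_true
```

The point of using the MLE rather than a least-squares line on a log-log plot — a practice the paper criticizes — is that the log-log regression estimate is biased and comes with no honest error bars.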
Links to Cosma's other posts on the topic can be found in "Power laws and other heavy-tailed distributions", and a recent discussion can be found in "Power laws", 5/30/2018.
It's worth noting that there are many random processes that can be shown mathematically to produce power-law distributions, at least in the tails — as Cosma puts it,
there turn out to be nine and sixty ways of constructing power laws, and every single one of them is right, in that it does indeed produce a power law. Power laws turn out to result from a kind of central limit theorem for multiplicative growth processes, […].
So in a way it's surprising that so many of the power-law claims turn out to be bogus or at least doubtful.
And what about the claims of power-law distributions in the vocalizations of whales, dolphins, and other animals? I'm not sure. But the fact that the key review paper doesn't list Clauset et al. among its hundreds of references, and that none of the relevant papers seems to apply the tests described in Clauset et al., or to offer links to their underlying data, makes me suspicious. (If any readers know of papers that apply the needed tests, or offer datasets suitable for checking the claims, please let me know.)
And as a footnote: The nature of processes that generate power-law distributions was the topic of what might be the most unpleasant debate in the history of mathematical modeling. This battle took place between 1955 and 1961, and the combatants were Herbert Simon and Benoit Mandelbrot. See "The long tail of religious studies" for links and details.
Update — see also Heathcote, Brown, and Mewhort, "The power law repealed: The case for an exponential law of practice", Psychonomic Bulletin & Review 2000:
The power function is treated as the law relating response time to practice trials. However, the evidence for a power law is flawed, because it is based on averaged data. We report a survey that assessed the form of the practice function for individual learners and learning conditions in paradigms that have shaped theories of skill acquisition. We fit power and exponential functions to 40 sets of data representing 7,910 learning series from 475 subjects in 24 experiments. The exponential function fit better than the power function in all the unaveraged data sets. Averaging produced a bias in favor of the power function.
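The averaging artifact is easy to demonstrate numerically. In this toy sketch (learning rates are made-up numbers, not from the paper), each of two simulated learners has an exactly exponential practice curve, which is a straight line on a log scale; their average, however, is convex on a log scale, the signature that power-function fits pick up:

```python
import math

trials = range(1, 41)

# Two hypothetical learners, each with a genuinely exponential practice
# curve RT(n) = exp(-rate * n), but with different learning rates.
fast = [math.exp(-0.30 * n) for n in trials]
slow = [math.exp(-0.05 * n) for n in trials]
avg = [(f + s) / 2 for f, s in zip(fast, slow)]

def log_second_diff(series, i):
    """Discrete curvature of the curve on a log scale:
    exactly zero for a pure exponential."""
    a, b, c = (math.log(series[j]) for j in (i - 1, i, i + 1))
    return a - 2 * b + c

# Each individual curve is log-linear ...
print(round(log_second_diff(fast, 10), 6))  # ~ 0
# ... but their average is not: positive curvature mimics a power law.
print(round(log_second_diff(avg, 10), 6))   # > 0
```

So a power law in group-averaged data is compatible with every individual following an exponential, which is why Heathcote et al. insist on fitting unaveraged series.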
And also Michael Ramscar, "Source codes in human communication", preprint 3/22/2019.
June 2, 2019 @ 6:50 am · Filed by Mark Liberman under Computational linguistics