Discrete power law distributions

In summary, the conversation discusses the issue of gaps in data when analyzing degree distributions for finite graphs. The first question addresses how to handle gaps at the high end of the degree scale and throughout the data. The second question mentions a proposed method for calculating the power law exponent, but raises concerns about its applicability to graphs with an exponent less than 2. The third question considers a simplistic approach to estimating the power law exponent, but raises concerns about potential risks. Suggestions for alternative distributions and estimation methods are also mentioned.
  • #1
Old Guy
I'm not a mathematician, but I want to understand how a mathematician would view this issue.

I'm working primarily with degree distributions for finite graphs, and when I make a log-log plot of the frequency distribution, the data points form a nice straight line (at least for low degree values). To be specific, the x-axis is the degree (number of edges per vertex) and the y-axis is the number of vertices that have that degree.
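To make the plotted quantities concrete, here is a minimal sketch of how the degree frequencies and the log-log points could be computed. It assumes the graph is given as an undirected edge list; the function names and the toy graph are hypothetical, not from the post. Vertices with no edges never appear in an edge list, so degree 0 is not counted here.

```python
import math
from collections import Counter

def degree_counts(edges):
    """Map each degree to the number of vertices having that degree."""
    degree = Counter()
    for u, v in edges:
        # Each undirected edge adds one to the degree of both endpoints.
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())

def loglog_points(counts):
    """Return sorted (log degree, log frequency) pairs, skipping empty bins."""
    return sorted(
        (math.log(k), math.log(n)) for k, n in counts.items() if k > 0 and n > 0
    )

# Hypothetical toy graph: a star centred on vertex 0 plus the edge (1, 2).
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
counts = degree_counts(edges)
points = loglog_points(counts)
```

Degrees that no vertex has simply do not appear in `counts`, which is how gaps show up in this representation.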

Question 1: What is the convention for dealing with gaps in the data? This has two parts. First, it is obvious that there will be gaps at the high end of the degree scale. How is this dealt with when, for example, trying to find the power law exponent? Second, what about the case where there are gaps throughout the data, such as when the vertices are constrained to have only an even number of edges?

Question 2: A. Clauset, C. Shalizi, and M. Newman, SIAM Rev. 51, 661 (2009) discuss the calculation of the exponent for power law distributions and propose an MLE method and a KS test for goodness of fit. It seems to me that this won't work for graphs where the exponent is less than 2 because the zeta function blows up. Can anyone suggest an alternative? Or is there a mathematical basis for saying that these distributions MUST be something other than a power law?
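For reference, a rough sketch of a discrete maximum-likelihood fit of the kind Question 2 refers to, assuming the model p(k) = k^(-alpha) / zeta(alpha, k_min) for k >= k_min. This is not the authors' code: the function names, the Euler-Maclaurin approximation of the normalising sum, the golden-section search, and the synthetic sample are all assumptions made for illustration. The normalising sum is finite for any alpha > 1, so the sketch rejects only alpha <= 1.

```python
import math

def hurwitz_zeta(alpha, k_min, n_terms=1000):
    """Approximate sum over k >= k_min of k**(-alpha), valid for alpha > 1."""
    if alpha <= 1:
        raise ValueError("the normalising sum diverges for alpha <= 1")
    upper = k_min + n_terms
    head = sum(k ** -alpha for k in range(k_min, upper))
    # Euler-Maclaurin tail: integral from `upper` plus half the first term.
    tail = upper ** (1 - alpha) / (alpha - 1) + 0.5 * upper ** -alpha
    return head + tail

def log_likelihood(alpha, n, sum_log_k, k_min=1):
    """Log-likelihood of n observations whose logs sum to sum_log_k."""
    return -n * math.log(hurwitz_zeta(alpha, k_min)) - alpha * sum_log_k

def fit_alpha(degrees, k_min=1, lo=1.01, hi=6.0, tol=1e-6):
    """Maximise the log-likelihood over [lo, hi] by golden-section search."""
    data = [k for k in degrees if k >= k_min]
    n = len(data)
    sum_log_k = sum(math.log(k) for k in data)
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if log_likelihood(c, n, sum_log_k, k_min) > log_likelihood(d, n, sum_log_k, k_min):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2

# Synthetic sample (assumption): counts proportional to k**-2.5, rounded.
sample = []
for k in range(1, 60):
    sample.extend([k] * round(10000 * k ** -2.5 / hurwitz_zeta(2.5, 1)))
alpha_hat = fit_alpha(sample)
```

Only the observed degrees enter the likelihood, so degrees that never occur contribute nothing to the fit.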

Question 3: A simplistic approach would be to base the power law exponent on the data that fits it. This would essentially ignore the zero values and (potentially) some of the high degree tail of the distribution. On the other hand, it would be (I think) a useful description of the behavior of the bulk of the system. What are the risks here?
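The simplistic approach in Question 3 could look something like the sketch below: an ordinary least-squares line through the log-log points, restricted to a chosen degree range and skipping zero counts. The function name, the range parameters `k_lo` and `k_hi`, and the example counts are hypothetical. The paper cited in Question 2 argues that fits of this kind can give biased exponent estimates, so this is shown only to pin down what the approach does.

```python
import math

def loglog_slope(counts, k_lo=1, k_hi=None):
    """Least-squares slope of log(frequency) against log(degree).

    Bins with zero count, and degrees outside [k_lo, k_hi], are ignored.
    """
    points = [
        (math.log(k), math.log(n))
        for k, n in counts.items()
        if n > 0 and k >= k_lo and (k_hi is None or k <= k_hi)
    ]
    if len(points) < 2:
        raise ValueError("need at least two non-empty bins to fit a line")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx

# Hypothetical frequencies lying exactly on n = 1024 * k**-2, with one empty bin.
counts = {1: 1024, 2: 256, 3: 0, 4: 64, 8: 16}
slope = loglog_slope(counts)
exponent = -slope
```

The choice of `k_lo` and `k_hi` is where the judgment call in Question 3 enters: the estimate describes only the range of degrees that was kept.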

Thanks!
 
  • #2
The gaps between the extreme values aren't really a problem for most estimation methods.

Also, as an alternative to the zeta distribution there is the Zipf distribution, which is similar but has a finite maximum; that would solve the blow-up problem. MLE could be used for the fit, but it tends to set the upper bound to the largest data point, so perhaps minimum-variance unbiased estimators (MVUEs) could be used instead (though I don't know the details; look up the German Tank Problem for an example).
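A minimal sketch of the finite-maximum idea, assuming the model p(k) = k^(-s) / H(k_max, s) for k = 1..k_max, where H is the finite normalising sum. Because the sum is finite, the fit is defined for any exponent, including values below 2. The function names, the golden-section search, and the synthetic sample are assumptions for illustration; `k_max` defaults to the largest observation, which is where the reply says a plain MLE tends to put it.

```python
import math

def truncated_zipf_loglik(s, n, sum_log_k, k_max):
    """Log-likelihood of n observations under a Zipf law truncated at k_max."""
    norm = sum(k ** -s for k in range(1, k_max + 1))  # always finite
    return -n * math.log(norm) - s * sum_log_k

def fit_truncated_zipf(degrees, k_max=None, lo=0.0, hi=6.0, tol=1e-6):
    """Estimate the exponent s by golden-section search on [lo, hi]."""
    if k_max is None:
        k_max = max(degrees)  # assumption: upper bound = largest data point
    n = len(degrees)
    sum_log_k = sum(math.log(k) for k in degrees)
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if truncated_zipf_loglik(c, n, sum_log_k, k_max) > truncated_zipf_loglik(d, n, sum_log_k, k_max):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2

# Synthetic sample (assumption): counts proportional to k**-1.5 for k = 1..100.
sample = []
for k in range(1, 101):
    sample.extend([k] * round(10000 * k ** -1.5))
s_hat = fit_truncated_zipf(sample, k_max=100)
```

Passing a `k_max` larger than the largest observed degree is one simple way to explore how sensitive the estimate is to the choice of upper bound.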
 

1. What is a discrete power law distribution?

A discrete power law distribution is a probability distribution on the positive integers in which the probability of observing the value k is proportional to k raised to a negative power, p(k) ∝ k^(-α). Small values are common and large values are rare, but the frequency falls off slowly as k grows, so very large values occur far more often than they would under an exponentially decaying distribution.

2. What are some real-world examples of discrete power law distributions?

One common example is the distribution of city sizes, where there are a few very large cities and many small towns. Other examples include the frequency of words in a language, the number of citations received by scientific papers, and the sizes of earthquakes.

3. How is a discrete power law distribution different from a normal distribution?

A normal distribution, also known as a bell curve, is continuous and symmetric: most observations fall near the mean, and their frequency drops off very quickly with distance from the mean. A discrete power law distribution, on the other hand, is defined on integers, is heavily skewed, and has a long tail of rare values that can lie far from the most common ones.

4. What are the implications of a discrete power law distribution in data analysis?

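Discrete power law distributions can significantly affect data analysis because a handful of extreme observations can dominate summary statistics such as the sample mean and variance, which makes predictions based on them unreliable. For an untruncated power law, the mean is infinite when the exponent is 2 or less, and the variance is infinite when the exponent is 3 or less. It is important to identify and account for these distributions when analyzing data, especially in fields such as economics, social sciences, and technology.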

5. Are there any limitations to using a discrete power law distribution?

While a discrete power law distribution can accurately model certain real-world phenomena, it may not be suitable for all situations. In some cases, a different distribution may be a better fit for the data. It is important to carefully consider the characteristics of the data before using a discrete power law distribution in analysis.
