Discrete power law distributions

Click For Summary
SUMMARY

The discussion centers on the analysis of discrete power law distributions, particularly in the context of degree distributions for finite graphs. Key points include the handling of gaps in data when calculating power law exponents, referencing the work of Clauset, Shalizi, and Newman (2009) on maximum likelihood estimation (MLE) and the Kolmogorov-Smirnov (KS) test. The conversation highlights the limitations of these methods for exponents less than 2 due to issues with the zeta function and suggests alternatives such as the Zipf distribution and minimum variance unbiased estimators (MVUEs) for better fitting. The risks of ignoring zero values and high degree tails in power law fitting are also discussed.

PREREQUISITES
  • Understanding of degree distributions in graph theory
  • Familiarity with power law distributions and their properties
  • Knowledge of maximum likelihood estimation (MLE) and goodness of fit tests
  • Basic concepts of statistical distributions, including the Zipf distribution
NEXT STEPS
  • Research the application of the Kolmogorov-Smirnov (KS) test for goodness of fit in power law distributions
  • Explore the implications of the Zipf distribution in statistical modeling
  • Investigate minimum variance unbiased estimators (MVUEs) and their applications in fitting distributions
  • Study the German Tank Problem and its relevance to statistical estimation techniques
USEFUL FOR

Mathematicians, data scientists, and researchers working with graph theory, particularly those analyzing degree distributions and power law behaviors in complex networks.

Old Guy
Messages
101
Reaction score
1
I'm not a mathematician, but I want to understand how a mathematician would view this issue.

I'm working primarily with degree distributions for finite graphs, and when I make a log log plot of the frequency distribution the data points form a nice straight line (at least for low degree values). To be specific, the x-axis is the degree (number of edges per vertex) and the y-axis is the number of vertices that have a particular degree.

Question 1: What is the convention for dealing with gaps in the data? This has two parts. First, it is obvious that there will be gaps at the high end of the degree scale. How is this dealt with when, for example, trying to find the power law exponent? What about the case where there are gaps throughout the data, such as the case where the vertices are constrained to only an even number of vertices?

Question 2: A. Clauset, C. Shalizi, and M. Newman, SIAM Rev. 51, 661 (2009) discusses calculation of the exponent for power law distributions, and propose a MLE method and KS test for goodness of fit. It seems to me that this won't work for graphs where the exponent is less than 2 because the zeta function blows up. Can anyone suggest an alternative? Or is there a mathematical basis for saying that these distributions MUST be something other than a power law?

Question 3: A simplistic approach would be to base the power law exponent on the data that fits it. This would essentially ignore the zero values and (potentially) some of the high degree tail of the distribution. On the other hand, it would be (I think) a useful description of the behavior of the bulk of the system. What are the risks here?

Thanks!
 
Physics news on Phys.org
The gaps between the extreme values aren't really a problem for most estimation methods.

Also as an alternative to Zeta there's the Zipf distribution which is similar but has a finite maximum, which would solve the blowup problem. MLE could be used for the fit but it tends to set the upper bound to the largest data point, so perhaps MVUEs could be used instead (though I don't know the details - lookup the German Tank Problem for an example).
 

Similar threads

  • · Replies 5 ·
Replies
5
Views
4K
  • · Replies 25 ·
Replies
25
Views
6K
Replies
3
Views
3K
  • · Replies 11 ·
Replies
11
Views
3K
  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 7 ·
Replies
7
Views
3K
  • · Replies 21 ·
Replies
21
Views
4K
  • · Replies 17 ·
Replies
17
Views
6K
  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 12 ·
Replies
12
Views
5K