Can a Beta Distribution Model Scores in the Interval [0,1] for Ranked Retrieval?

In summary: the linked paper discusses how to obtain the probability of relevance (or non-relevance) given a score. It constructs two score distributions, one for relevant and one for non-relevant items, which can then be used to project a curve of precision and recall. Comments are appreciated.
  • #1
TheOldHag
I have a set of scored items with the scores in the interval [0,1]. Roughly speaking, about 50% of the scores are equal to 0, and the rest of the distribution slopes steeply downward toward one, or near one. I want to fit this data to a distribution and use that down the road in some calculation, but I'm not sure how to proceed.

My guess is that since the data lie in the interval [0,1], they can be modeled with a beta distribution. So now I need to find the parameters alpha and beta. Is it as easy as calculating the sample mean and the sample variance and working backwards from the equations for the mean and variance of a beta distribution, or does that only work for normal distributions? Since these are samples, do they approximate a normal distribution, so that I should be fitting a normal distribution to the data despite the interval [0,1] (it would have very thin tails)? Comments appreciated.
 
  • #2
I'm sorry you are not generating any responses at the moment. Is there any additional information you can share with us? Any new findings?
 
  • #3
I think this contains what I'm looking for but have not dug in yet since this problem has been set aside temporarily.

http://dare.uva.nl/document/125861

The general issue I'm having here concerns ranked retrieval. I have rankings, they work, and I can present the items in descending order so that the user sees the more relevant items first. But for a variety of other applications it is useful to know the probability of relevance (or non-relevance) given a score. This paper seems to construct two distributions, one for the score given relevant and one for the score given non-relevant, and goes from there. Another thing I can do with this is project a possible curve of precision and recall as the user proceeds through the items in ranked order.
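To make the idea concrete, here is a minimal sketch of turning two class-conditional score densities into a probability of relevance via Bayes' rule. It is not taken from the linked paper: the choice of Beta densities for both classes, the parameter values, the prior, and the helper names `beta_pdf` and `prob_relevant` are all assumptions made for illustration.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, valid for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def prob_relevant(score, prior_rel, rel_params, nonrel_params):
    """P(relevant | score) from Bayes' rule and two score densities."""
    weight_rel = prior_rel * beta_pdf(score, *rel_params)
    weight_non = (1 - prior_rel) * beta_pdf(score, *nonrel_params)
    return weight_rel / (weight_rel + weight_non)

# Hypothetical values, chosen only to illustrate the calculation.
REL_PARAMS = (4.0, 2.0)     # assumed score density for relevant items
NONREL_PARAMS = (1.0, 5.0)  # assumed score density for non-relevant items
PRIOR_REL = 0.2             # assumed fraction of relevant items

for s in (0.1, 0.5, 0.9):
    print(s, prob_relevant(s, PRIOR_REL, REL_PARAMS, NONREL_PARAMS))
```

With these assumed parameters the likelihood ratio increases with the score, so the probability of relevance rises as the score does. In practice the two densities and the prior would have to be estimated from judged data.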
 

What is a Beta Distribution?

A Beta distribution is a probability distribution that is commonly used to model random variables that take values between 0 and 1, such as proportions or probabilities. It is a continuous distribution, meaning that the variable can take any value within that range.

What are the parameters of a Beta Distribution?

The two parameters of a Beta distribution are alpha and beta. These parameters determine the shape of the distribution and can take on any positive value. Alpha is often referred to as the "success" parameter, while beta is referred to as the "failure" parameter.

What is the relationship between a Beta Distribution and a Binomial Distribution?

A Binomial distribution is used to model the number of successes in a fixed number of trials, while a Beta distribution can be used to model uncertainty about the probability of success in each trial. The two distributions are related because the Beta distribution is the conjugate prior for the probability of success in a Binomial distribution: a Beta prior combined with Binomial data gives a Beta posterior.
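As a small illustration of that relationship, the sketch below updates a Beta prior with observed successes and failures. The function names `update_beta` and `beta_mean` and the example counts are made up for this illustration.

```python
def update_beta(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing Binomial outcomes."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Hypothetical example: uniform prior Beta(1, 1), then 7 successes and 3 failures.
prior = (1.0, 1.0)
posterior = update_beta(*prior, successes=7, failures=3)
print(posterior, beta_mean(*posterior))
```

The update only adds the counts to the parameters, which is why alpha and beta are often read as pseudo-counts of successes and failures.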

How do you fit a Beta Distribution to data?

To fit a Beta distribution to data, you will need to estimate the alpha and beta parameters based on the data. This can be done using maximum likelihood estimation or Bayesian inference. There are also software packages available that can automatically fit a Beta distribution to data.
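The approach asked about at the top of the thread, working backwards from the sample mean and sample variance, is the method of moments, and it applies to the beta distribution as well as to the normal. Below is a minimal sketch using only the Python standard library; the function name `fit_beta_moments` and the sample scores are made up for illustration. One caveat: the beta density has no point mass at 0, so a data set with many scores exactly equal to 0 may be better served by handling the zeros separately, and the beta log-likelihood is not finite at exact zeros unless alpha equals 1.

```python
from statistics import fmean, variance

def fit_beta_moments(samples):
    """Method-of-moments estimates of (alpha, beta) for data in [0, 1]."""
    m = fmean(samples)
    v = variance(samples)  # sample variance with n - 1 in the denominator
    # A beta distribution requires 0 < mean < 1 and variance < mean * (1 - mean).
    if not 0 < m < 1 or v <= 0 or v >= m * (1 - m):
        raise ValueError("moments are not compatible with a beta distribution")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common

# Hypothetical scores: several zeros, then a steep decline toward 1.
scores = [0.0, 0.0, 0.0, 0.05, 0.1, 0.2, 0.4, 0.7]
alpha_hat, beta_hat = fit_beta_moments(scores)
print(alpha_hat, beta_hat)
```

The estimates reproduce the sample mean exactly, since alpha / (alpha + beta) equals the mean by construction. They can also serve as starting values for a maximum likelihood fit.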

What are some common applications of the Beta Distribution?

The Beta distribution is commonly used in Bayesian statistics as a prior distribution for the probability of success in a Binomial or Bernoulli distribution. It is also used in quality control, reliability analysis, and in the social sciences to model attitudes and opinions. Additionally, it is used to model quantities bounded between 0 and 1, such as recovery rates in finance and market share in marketing.
