How to correct (adjust) Amazon ratings based on the number of reviews

  • Thread starter: Wes Turner
AI Thread Summary
The discussion focuses on finding a method to adjust Amazon ratings based on the number of reviews, emphasizing that a high rating from many reviews is more reliable than a perfect score from few. A Bayesian approach is suggested for assessing the likelihood of a rating being above a certain threshold, though the user seeks a practical formula for implementation. The user experimented with calculating confidence intervals using both T-distribution and N-distribution functions in Excel to derive adjusted ratings. They present data comparing the results from these two methods and inquire about the appropriateness of different confidence levels. The conversation aims to refine the approach to accurately reflect product quality based on review quantity and rating.
Wes Turner
I would like a way to compare Amazon ratings as a function of the number of reviews. I think it's pretty clear that a product with a 4.9 rating based on 5,000 reviews is likely better than one with a 5.0 rating based on just 1 review. But how can I calculate an adjusted rating for a set of products such as the ones in this table of ratings and number of reviews?

1716078636487.png


Thanks
 
Well, you could set a threshold and use a Bayesian approach to assess the posterior probability that the rating is higher than the threshold.
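
One way to make that suggestion concrete, as a rough sketch rather than a definitive method: treat each product's true mean rating as unknown, put a normal prior on it, and assume individual reviews scatter around the true mean with a known standard deviation. The prior mean, prior standard deviation, per-review standard deviation, and threshold below are all illustrative assumptions, not values from this thread, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_above_threshold(rating, n_reviews, threshold,
                         prior_mean=4.0, prior_sd=0.5, review_sd=1.0):
    """Posterior P(true mean rating > threshold) under a normal-normal model.

    All default values are assumptions chosen for illustration only.
    """
    # Precision (1 / variance) contributed by the prior and by the reviews.
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n_reviews / review_sd ** 2
    post_precision = prior_precision + data_precision

    # Posterior mean is a precision-weighted blend of prior mean and rating.
    post_mean = (prior_precision * prior_mean
                 + data_precision * rating) / post_precision
    post_sd = sqrt(1.0 / post_precision)

    # Probability mass of the posterior that lies above the threshold.
    return 1.0 - NormalDist(post_mean, post_sd).cdf(threshold)


p_few = prob_above_threshold(5.0, 1, threshold=4.5)
p_many = prob_above_threshold(4.9, 5000, threshold=4.5)
print(f"5.0 from 1 review:      {p_few:.3f}")
print(f"4.9 from 5,000 reviews: {p_many:.3f}")
```

With only one review the posterior stays close to the prior, so the perfect 5.0 earns a much lower probability than the 4.9 backed by 5,000 reviews. Star ratings are bounded between 1 and 5, so the normal model is only an approximation.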
 
I don't know what that is or how to do it. Is there a formula I can put in the Adjusted Rating column?

I was thinking about calculating a confidence interval. Here's my first try at that. I calculated the confidence interval using the Excel T-distribution confidence interval function, then subtracted that from the rating.

I then did the same thing using the Excel N-distribution confidence interval function and subtracted that from the rating.
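
For anyone who wants to reproduce the normal-distribution version outside Excel, here is a minimal sketch. It assumes the Excel function in question is CONFIDENCE.NORM (the post does not name it), and it assumes a standard deviation for individual reviews, which cannot be recovered from the average rating and review count alone. The function name and the default standard deviation of 1.0 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def lower_bound_normal(rating, n_reviews, review_sd=1.0, confidence=0.95):
    """Rating minus the normal-based margin of error of a two-sided interval.

    review_sd is an assumed per-review standard deviation.
    """
    alpha = 1.0 - confidence
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * review_sd / sqrt(n_reviews)
    return rating - margin


for rating, n in [(5.0, 1), (4.9, 5000)]:
    print(rating, n, round(lower_bound_normal(rating, n), 3))
```

The t-based version replaces z with a t quantile on n - 1 degrees of freedom, which is not defined for a single review; Python's standard library has no t quantile, so it is left out here. Raising the confidence level widens the margin and so lowers every adjusted rating, with the largest effect on products that have few reviews.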

Here's the data for a 95% confidence interval. The last column shows the difference in the rankings from using an N-distribution function vs a T-distribution function.

1716085458200.png


Any comments? Is there a better way? Should I use a different confidence interval (97%? 99%?)?

Thanks
 