Given a sample of points how could I determine what distribution does

  • Context: Undergrad 
  • Thread starter: porums
  • Tags: Distribution, Points
SUMMARY

To determine the distribution a sample of points comes from, first classify the data as discrete or continuous. The process involves fitting a probability distribution: compute statistics such as the mean and variance, and rule out distributions that do not suit the data. Useful techniques include graphing the data with boxplots, histograms, and probability plots to check for outliers, skewness, and symmetry. Very large samples may be needed to tell apart similar symmetric distributions such as the normal and the t-distribution.

PREREQUISITES
  • Understanding of discrete and continuous data types
  • Familiarity with probability distribution fitting techniques
  • Ability to compute basic statistics (mean, variance)
  • Experience with data visualization tools (e.g., boxplots, histograms)
NEXT STEPS
  • Learn about fitting probability distributions using statistical software (e.g., R or Python's SciPy library)
  • Explore the use of boxplots and histograms for data visualization
  • Study the characteristics of normal and t-distributions
  • Investigate the Central Limit Theorem and its implications for sample size
USEFUL FOR

Data analysts, statisticians, and researchers seeking to understand data distribution and improve their statistical analysis skills.

porums
Given a sample of points, how could I determine what distribution it belongs to?

Any help for a newbie, please?
 


The first step is for you to decide whether it's a discrete or a continuous distribution. Then you want to look into "fitting a probability distribution". There are various ways to do that; one of them is to compute a few statistics and see if you notice anything, e.g. mean = variance (which would point towards a Poisson distribution), etc.
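
As a rough illustration of the "compute a few statistics" idea, here is a minimal Python sketch using only the standard library. The `summarize` helper and the `counts` data are made up for the example. A variance-to-mean ratio near 1 is consistent with a Poisson distribution but does not prove it; a ratio well below 1 or well above 1 would hint at something else (for instance binomial or negative binomial).

```python
import statistics

def summarize(sample):
    """Return the mean, the sample variance, and the variance-to-mean ratio."""
    mean = statistics.fmean(sample)
    var = statistics.variance(sample)  # sample variance (n - 1 denominator)
    ratio = var / mean if mean != 0 else float("nan")
    return {"mean": mean, "variance": var, "var_over_mean": ratio}

# Hypothetical count data, invented purely for illustration
counts = [2, 4, 3, 1, 5, 3, 2, 4, 3, 3, 0, 6, 2, 4, 3]

stats = summarize(counts)
print(stats)
```

With a sample this small the ratio is very noisy, so treat it as a hint about which candidates to look at, not as a decision.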
 


It is more a process of ruling out the distributions your sample is not from. Start by graphing the data: a boxplot (as a guide to outliers, skewness, or symmetry), a histogram (for a large enough sample), probability plots, etc.
Be aware that it can take very large sample sizes to determine whether (as an example) your data is better described by a normal distribution, a t-distribution, or some other symmetric distribution.
With what source of data are you working?
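
The graphical checks above have simple numeric counterparts, which might be sketched as follows (standard-library Python only). The function names, the `measurements` data, and the `(i - 0.5) / n` plotting positions are assumptions of this example; other conventions exist. The outlier rule is the usual 1.5 × IQR boxplot fence, and the last function returns the points of a normal probability plot along with their correlation.

```python
import math
import statistics

def boxplot_outliers(sample):
    """Points beyond 1.5 * IQR from the quartiles (the usual boxplot fences)."""
    q1, _, q3 = statistics.quantiles(sample, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in sample if x < low or x > high]

def skewness(sample):
    """Moment-based sample skewness; close to 0 for symmetric data."""
    n = len(sample)
    m = statistics.fmean(sample)
    m2 = sum((x - m) ** 2 for x in sample) / n
    m3 = sum((x - m) ** 3 for x in sample) / n
    return m3 / m2 ** 1.5

def normal_probability_plot(sample):
    """Return (theoretical, observed) quantile pairs and their correlation."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = statistics.NormalDist()
    # Plotting positions (i - 0.5) / n: one common convention among several
    theo = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mt, mx = statistics.fmean(theo), statistics.fmean(xs)
    cov = sum((t - mt) * (x - mx) for t, x in zip(theo, xs))
    spread = math.sqrt(sum((t - mt) ** 2 for t in theo)
                       * sum((x - mx) ** 2 for x in xs))
    return list(zip(theo, xs)), cov / spread

# Hypothetical measurements with one suspicious value, for illustration only
measurements = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9, 9.5]

outliers = boxplot_outliers(measurements)
skew = skewness(measurements)
points, r = normal_probability_plot(measurements)
print(outliers, skew, r)
```

Plotting `points` as a scatter gives the probability plot itself: points close to a straight line (correlation near 1) mean a normal distribution has not been ruled out, while visible curvature or a stray point suggests skewness, heavy tails, or outliers.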
 
