Interpreting Normal Distribution Logic for Data Analysis

In summary, the poster is looking for mathematical arguments to support the hypothesis that their data are normally distributed. They have built a histogram whose shape resembles a normal distribution and ask for other ways to support this hypothesis before performing hypothesis tests. The data are inter-arrival times of buses. Suggestions include evaluating skewness and kurtosis and using specific tests of normality such as Kullback-Leibler divergence and Kolmogorov-Smirnov.
  • #1
Mark J.
Hi.
I need some arguments to support the hypothesis that my data are normally distributed.
Apart from the fact that I built a histogram and its shape resembles a normal distribution, how can I find other mathematical arguments (I mean arguments based on the logic of the normal distribution) for this hypothesis before I begin hypothesis testing with different criteria and tests?

Regards
 
  • #2
One common explanation for a phenomenon producing a normal distribution is that it is the sum (meaning literally an arithmetic sum) of many independent random variables.

A hazier philosophical version of this reasoning is that a phenomenon that results from the combined effect of many small and independent random causes will have a normal distribution. (This argument is too vague to be tested mathematically, but you haven't made it clear what kind of "justification" for a normal distribution you want.)
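
To make the "sum of many independent random variables" argument concrete, here is a minimal simulation sketch. The choice of 50 uniform terms per observation is an assumption made purely for illustration; nothing in the thread says the bus data arise this way.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical setup: each observation is the arithmetic sum of 50
# independent Uniform(0, 1) contributions (an assumption for illustration).
n_terms, n_obs = 50, 10_000
sums = rng.uniform(0.0, 1.0, size=(n_obs, n_terms)).sum(axis=1)

# Theory for a sum of uniforms: mean = n_terms / 2, variance = n_terms / 12.
print("sample mean:", sums.mean(), "| theory:", n_terms / 2)
print("sample var :", sums.var(ddof=1), "| theory:", n_terms / 12)

# A normal distribution has skewness 0 and excess kurtosis 0.
sum_skew = stats.skew(sums)
sum_excess_kurt = stats.kurtosis(sums)  # Fisher definition: normal -> 0
print("skewness:", sum_skew, "| excess kurtosis:", sum_excess_kurt)
```

Although each uniform term is far from bell-shaped, the sums are already close to normal. The argument only helps if a single observation can plausibly be decomposed into many independent additive parts.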
 
  • #3
I have a process of bus arrivals.
Taking the inter-arrival times between two consecutive buses as random variables, I built a histogram. It has the shape of a normal distribution, but it also looks similar to a log-normal, etc.
Now I am looking for mathematical arguments (theory) that I can check to justify the hypothesis of a normal distribution.
Could you please advise me on that?
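
One way to separate the two candidate shapes mentioned here: a variable is log-normal exactly when its logarithm is normal, so the skewness of the raw times can be compared with the skewness of their logarithms. The sketch below uses synthetic log-normal times as a stand-in, which is an assumption; the real measurements would replace `times`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic stand-in for inter-arrival times in minutes. The log-normal
# shape and its parameters are assumed for illustration only.
times = rng.lognormal(mean=np.log(10.0), sigma=0.5, size=2_000)

raw_skew = stats.skew(times)          # clearly positive for log-normal data
log_skew = stats.skew(np.log(times))  # near 0 if the data are log-normal

print("skewness of times     :", raw_skew)
print("skewness of log(times):", log_skew)
```

If the raw skewness is near 0 and the log-transformed skewness is clearly negative, the normal shape is the better description; the reverse pattern points to the log-normal. Inter-arrival times cannot be negative, so a normal distribution can only ever be an approximation for them.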
 
  • #4
Mark J. said:
I have a process of bus arrivals.
Taking the inter-arrival times between two consecutive buses as random variables, I built a histogram. It has the shape of a normal distribution, but it also looks similar to a log-normal, etc.
Now I am looking for mathematical arguments (theory) that I can check to justify the hypothesis of a normal distribution.
Could you please advise me on that?

There are a number of tests of normality. You can directly evaluate the skewness and the excess kurtosis (the standardized third and fourth moments, the latter minus 3), both of which should be close to 0 for normal data. In addition there are specific tests which you can look up: Kullback-Leibler divergence, Kolmogorov-Smirnov (in its adaptation for normality testing), D'Agostino's K-squared test, Anderson-Darling, Shapiro-Wilk, and the tests in SPSS and other statistical software.

http://webspace.ship.edu/pgmarr/Geo441/Examples/Normality Tests.pdf
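
As a rough sketch of the moment check described above, the following uses SciPy's `skew`, `kurtosis`, and `normaltest` (the D'Agostino-Pearson K-squared test). The helper name `moment_summary` and both synthetic samples are assumptions for illustration, not part of the original advice.

```python
import numpy as np
from scipy import stats


def moment_summary(x):
    """Return skewness, excess kurtosis and the K^2 test result for x."""
    x = np.asarray(x, dtype=float)
    k2_stat, k2_p = stats.normaltest(x)  # combines skewness and kurtosis
    return {
        "skewness": stats.skew(x),             # 0 for a normal distribution
        "excess_kurtosis": stats.kurtosis(x),  # Fisher definition: normal -> 0
        "k2_statistic": k2_stat,
        "k2_pvalue": k2_p,
    }


rng = np.random.default_rng(3)
# Synthetic stand-ins (assumed); replace with the measured inter-arrival times.
normal_like = rng.normal(loc=10.0, scale=2.0, size=500)
skewed = rng.exponential(scale=10.0, size=500)

summary_normal = moment_summary(normal_like)
summary_skewed = moment_summary(skewed)
print("normal-like sample:", summary_normal)
print("skewed sample     :", summary_skewed)
```

For the skewed sample both moments are far from 0 and the K-squared p-value is very small, which is the pattern that argues against normality.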
 
  • #5
I understand the importance of checking whether the data are approximately normally distributed before applying tests that assume normality. A normal distribution is a continuous distribution with a symmetric, bell-shaped density. Its mean, median, and mode are equal, and about 68% of its probability lies within one standard deviation of the mean (about 95% within two).

One way to further support your hypothesis that your data is normally distributed is by calculating the skewness and kurtosis of your data. Skewness measures the asymmetry of the distribution, while kurtosis mainly reflects how heavy its tails are. For a normal distribution the skewness is 0 and the kurtosis is 3 (equivalently, the excess kurtosis is 0), so the sample values should be close to these.
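
For reference, one common (moment-based) way to define the sample versions of these quantities is shown below. The symbols m_k, g_1 and g_2 are notation introduced here; statistical packages may apply small-sample corrections or report g_2 - 3 instead.

```latex
m_k = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k, \qquad
g_1 = \frac{m_3}{m_2^{3/2}} \;\;\text{(skewness)}, \qquad
g_2 = \frac{m_4}{m_2^{2}} \;\;\text{(kurtosis)}
```

As a tiny worked case, for the data (1, 2, 3) the mean is 2, so m_2 = 2/3, m_3 = 0 and m_4 = 2/3, giving g_1 = 0 and g_2 = 3/2: symmetric, but with lighter tails than a normal distribution.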

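Another important aspect to consider is the Central Limit Theorem. This theorem states that, for independent observations from a population with finite variance, the distribution of the sample mean (or sum) approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution. Note that this is a statement about means and sums, not about the individual observations: collecting more inter-arrival times does not make the times themselves more normal. The theorem supports a normality argument only if each observation can itself be regarded as the sum of many independent contributions.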
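
The distinction between raw observations and their means can be seen in a short simulation. This is a sketch under assumed conditions: exponential raw data with mean 10 and batches of 100 observations are arbitrary illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Assumed raw data: 100,000 exponential observations (theoretical skewness 2).
raw = rng.exponential(scale=10.0, size=100_000)
raw_skew = stats.skew(raw)  # stays near 2 no matter how large the sample is

# Means of 1,000 batches of 100 observations each.
batch_means = raw.reshape(1_000, 100).mean(axis=1)
mean_skew = stats.skew(batch_means)  # theory: about 2 / sqrt(100) = 0.2

print("skewness of raw data   :", raw_skew)
print("skewness of batch means:", mean_skew)
```

The raw data remain strongly skewed even with a very large sample, while the batch means are much closer to symmetric.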

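Additionally, you can use statistical tests such as the Kolmogorov-Smirnov test or the Shapiro-Wilk test to formally assess the normality of your data. These tests compare your data to a theoretical normal distribution and provide a p-value: the probability, if the data really were normal, of obtaining a test statistic at least as extreme as the one observed. A small p-value is evidence against normality; a large one means only that the test found no evidence against it, not that normality has been proven.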
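
A minimal sketch of running both tests with SciPy follows. The exponential sample is a synthetic stand-in (an assumption) chosen so that the tests have something to detect; the real data would replace `sample`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Synthetic, deliberately non-normal stand-in for the measured times.
sample = rng.exponential(scale=10.0, size=300)

# Shapiro-Wilk: the null hypothesis is that the sample is normal.
shapiro_stat, shapiro_p = stats.shapiro(sample)

# Kolmogorov-Smirnov against a normal with the sample's own mean and
# standard deviation. Because these parameters are estimated from the same
# data, the plain KS p-value tends to be too large; the Lilliefors
# correction exists to address this.
mu, sigma = sample.mean(), sample.std(ddof=1)
ks_stat, ks_p = stats.kstest(sample, "norm", args=(mu, sigma))

print("Shapiro-Wilk       W =", shapiro_stat, " p =", shapiro_p)
print("Kolmogorov-Smirnov D =", ks_stat, " p =", ks_p)
```

Both p-values are tiny for this sample, so normality is rejected. With real data a large p-value would only mean that the test did not detect a departure.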

In conclusion, there are several mathematical arguments and tests that can be used to support your hypothesis that your data is normally distributed. None of them can prove normality, so it is important to thoroughly examine your data and use a combination of approaches to ensure the validity of your results. Good luck with your analysis!
 

Related to Interpreting Normal Distribution Logic for Data Analysis

What is a normal distribution?

A normal distribution is a continuous probability distribution with a symmetric, bell-shaped density that is fully determined by its mean and standard deviation. The majority of data points cluster around the mean, with fewer points towards the edges of the curve.

How can I tell if my data follows a normal distribution?

There are a few ways to determine if your data follows a normal distribution. One way is to visually inspect the data using a histogram or a Q-Q plot. You can also use statistical tests such as the Shapiro-Wilk test or the Kolmogorov-Smirnov test.
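
The Q-Q plot idea can also be checked numerically. The sketch below uses `scipy.stats.probplot`, which returns the plotted quantiles together with a least-squares line and its correlation coefficient r; the helper name `qq_correlation` and the synthetic samples are assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Synthetic stand-ins (assumed); replace with the measured data.
normal_like = rng.normal(10.0, 2.0, size=500)
skewed = rng.exponential(10.0, size=500)


def qq_correlation(x):
    """Correlation between ordered data and theoretical normal quantiles."""
    (theoretical_q, ordered_x), (slope, intercept, r) = stats.probplot(x, dist="norm")
    return r


r_normal = qq_correlation(normal_like)
r_skewed = qq_correlation(skewed)
print("Q-Q correlation, normal-like:", r_normal)
print("Q-Q correlation, skewed     :", r_skewed)
```

Points on a Q-Q plot of normal data fall close to a straight line, so r is very near 1; curvature in the plot lowers r. Plotting `theoretical_q` against `ordered_x` gives the visual version of the same check.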

Why is it important to understand normal distribution in data analysis?

Understanding the normal distribution is important because many statistical models and tests assume that the data, or the model errors, follow a normal distribution. By understanding the characteristics of a normal distribution, you can make more accurate interpretations and predictions from your data.

What can cause a data set to deviate from a normal distribution?

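There are several factors that can cause a data set to deviate from a normal distribution, including outliers, skewness, heavy tails, and natural bounds on the variable (waiting times, for example, cannot be negative). With small samples, random variation alone can make a histogram look non-normal even when the underlying population is normal. It's also possible that the underlying population does not follow a normal distribution.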

Are there any alternatives to using normal distribution in data analysis?

Yes, there are alternative distributions that can be used in data analysis, such as the binomial distribution, Poisson distribution, and exponential distribution. These distributions are often used when the data does not follow a normal distribution. It's important to choose the appropriate distribution based on the characteristics of your data.
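
As a hedged sketch of choosing between candidate distributions, the following fits a normal and an exponential model by maximum likelihood and compares them by log-likelihood and AIC. The synthetic exponential `times` and the choice of these two candidates are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
# Synthetic stand-in for inter-arrival times (assumed exponential here).
times = rng.exponential(scale=10.0, size=1_000)

# Maximum-likelihood fits of the two candidate models.
mu, sigma = stats.norm.fit(times)
loc, scale = stats.expon.fit(times, floc=0)  # fix the location at 0

loglik_normal = stats.norm.logpdf(times, mu, sigma).sum()
loglik_expon = stats.expon.logpdf(times, loc, scale).sum()

# AIC = 2 * (number of fitted parameters) - 2 * log-likelihood; lower is better.
aic_normal = 2 * 2 - 2 * loglik_normal
aic_expon = 2 * 1 - 2 * loglik_expon

print("normal     : log-likelihood =", loglik_normal, " AIC =", aic_normal)
print("exponential: log-likelihood =", loglik_expon, " AIC =", aic_expon)
```

A lower AIC indicates the better-supported model among those compared; it says nothing about whether either model is adequate in absolute terms.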
