# Statistical Significance Question

phuntyme
I'm working on a problem where I have a sample of 200 people. I have a total of 5 concepts which are viewed by each respondent, but are rotated so there is "no" bias due to viewing order. Obviously each concept does not have the opportunity to be in each of the 5 possible viewing positions equally, but each concept is placed in the first position an equal number of times (i.e. each concept is seen in the first position by 40 people).

Now, I want to compare rating scores from concept to concept.

I'm not sure that a standard t-test would do the trick here.

People I know in the market research industry have said that when a single respondent views multiple concepts, weaker concepts viewed in the last position (5th in this case) wind up with lower scores than if they were in the first position. Further, stronger concepts wind up with higher scores if viewed in the last position than if they were viewed in the first position. I would think this affects variance in some way even though the concepts are viewed in a randomized order from respondent to respondent.

So finally my question: Which test for significance would I use to test each concept against each other (or just one of them compared to another one)?
These are all proportions such as 30% rated Definitely Would Buy (top box of a 5-point scale).

Chi square? If people expressed no particular tendency toward any one concept, a uniform distribution of the 200 subjects over the 5 slots (with 40 subjects in each) could be expected. That would be your null hypothesis.

Maybe you could use nonparametric methods?

EnumaElish said:
Chi square? If people expressed no particular tendency toward any one concept, a uniform distribution of the 200 subjects over the 5 slots (with 40 subjects in each) could be expected. That would be your null hypothesis.

I agree with that. If there is no preference, you would expect the sample to be statistically no different from equal proportions. So you test that using chi-square like this: you have 5 slots; let the count in each be $s_i$, $i = 1,2,3,4,5$. Your null hypothesis is no preference, so the expected proportion is 0.2 for each. The chi-square statistic is

$$\chi^2 = \sum_{i=1}^5\frac{(s_i - 200(0.2))^2}{200(0.2)}$$

If $\chi^2 > \chi^2_{\alpha,\,4}$, then reject the null hypothesis of no effect: at least one of the slots is preferred.
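The test above can be sketched in a few lines of Python. The observed counts here are hypothetical (made up for illustration); only the sample size of 200 and the 5 equal-probability slots come from the thread.

```python
# Chi-square goodness-of-fit test for "no preference" across 5 concepts.
# The observed counts are hypothetical; under H0 each slot expects 200 * 0.2 = 40.
from scipy.stats import chisquare

observed = [52, 38, 41, 33, 36]      # hypothetical counts, summing to 200
expected = [200 * 0.2] * 5           # 40 per slot under the null hypothesis

chi2, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
# Reject H0 at level alpha if chi2 exceeds the critical value chi2_{alpha, 4}
# (about 9.49 for alpha = 0.05), equivalently if p < alpha.
```

With these particular made-up counts the statistic comes out well below the 5% critical value, so the null of no preference would not be rejected.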

Your social science friends bring up a good point, though. People make value-based judgements about things that are sometimes strongly situational. In fact, if one item makes them feel upset, they may not respond in an unbiased way to other items. I don't know how to help you with that one, except to say you might try clustering the data on the five points to see if you get strong clusters, then deal with each cluster by itself. You would not need to know why they cluster; you are simply hypothesizing again that people with similar biases will respond in about the same way.

Also, if you had 240 subjects, you could use each of the 120 (= 5!) permutations exactly twice, giving the 5 concepts an equal chance (1/120) of appearing in any particular order. Obviously this is a design problem, not a testing problem.
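The balanced design described above is easy to enumerate; a minimal sketch, using hypothetical concept labels A through E:

```python
# Balanced ordering design: with 240 subjects, each of the 5! = 120
# orderings of the concepts is assigned to exactly two subjects.
from itertools import permutations
from collections import Counter

concepts = ["A", "B", "C", "D", "E"]   # hypothetical labels
orders = list(permutations(concepts))
print(len(orders))                      # 120 distinct viewing orders

# Assign ordering i % 120 to subject i, so each order is used exactly twice.
assignment = {subject: orders[subject % 120] for subject in range(240)}
usage = Counter(assignment.values())
print(all(count == 2 for count in usage.values()))
```

In practice one would also shuffle which subject gets which ordering; the cyclic assignment here just demonstrates the counting.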

Strictly speaking, you should use a one-way ANOVA (analysis of variance), which tests equality of response patterns as the null hypothesis.

Your strategy of rotating the concepts is a good thing to do. Note that if you are doing a computer-generated survey, totally random placement of items is not hard to do. (There's a huge literature on statistical problems in survey research.)

Make a plot; always make a plot. For a crude estimate, you can use a t-test (binary proportion test) for the top concept against each of the others. If that's dicey, you'd best use ANOVA; SAS is an excellent system with which to conduct such an analysis.
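The crude proportion test suggested above can be sketched as a two-proportion z-test on the top-box rates. The counts below are hypothetical (a 30% vs. 22% "Definitely Would Buy" split is assumed for illustration; only n = 200 per concept comes from the thread):

```python
# Two-proportion z-test comparing top-box ("Definitely Would Buy") rates
# for two concepts. Counts are hypothetical.
from math import sqrt
from scipy.stats import norm

n1 = n2 = 200
x1, x2 = 60, 44                       # hypothetical top-box counts: 30% vs 22%

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)        # pooled proportion under H0: p1 = p2
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))         # two-sided p-value

print(f"z = {z:.2f}, p = {p_value:.4f}")
```

Note that comparing every pair of concepts this way inflates the overall error rate, which is one reason the ANOVA route (a single overall F test first) is recommended.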

And, don't forget common sense.
Regards,
Reilly Atkinson

(Look up Paired Comparison tests used with Analysis of Variance to get a bit more useful detail.)

The chi-square is better than ANOVA for categorical data. Besides, ANOVA assumes independent, identically distributed, normal variables; the chi-square is really just a goodness-of-fit test.
t-tests assume continuous variables and normal distributions, and his data is ordinal on five categories.

Let R ∈ {1,2,3,4,5} be the response and C ∈ {1,2,3,4,5} be the concept. If the concepts are thought of as treatments, he could do ANOVA as long as they are independent. That is the key, and there are tests for independence that can be done. The easiest way to get a sense is to look at the correlation matrix: if it is close to the identity matrix, you probably have independent variables.

But that is only part of it; he also needs to look at the distribution of responses on each concept and test for normality. That test can be done using a chi-square statistic, but since his data is discrete, an assumption of normality may be wildly unjustified unless he has 30 responses or more on each concept so that the CLT will apply. If those conditions were met, I might do ANOVA in SAS, since his response could then be thought of as a continuous variable. It is probably better to use chi-square throughout. Of course, we know you can transform some datasets to be normal enough, etc. He would not be asking the question if he were a statistician.
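The correlation-matrix check described above is a one-liner once the ratings are in a respondents-by-concepts array. The data here is simulated purely for illustration (the thread's real responses are not available):

```python
# Rough independence check: correlation matrix of the five concepts' ratings.
# Rows are respondents, columns are concepts; ratings are simulated 1-5 values.
import numpy as np

rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(200, 5))    # hypothetical 200 x 5 data

corr = np.corrcoef(ratings, rowvar=False)       # 5 x 5 correlation matrix
print(np.round(corr, 2))
# If the off-diagonal entries are all near 0, the matrix is close to the
# identity and treating the concepts as independent is more defensible.
```

With truly independent columns and n = 200, off-diagonal sample correlations should mostly fall within roughly ±2/√200 ≈ ±0.14 of zero.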

I'm hoping he is still watching the thread. If you are, please confirm the shape of your data set. I am assuming you have this:

Concept : Response1, Response2, Response3, ..., Responsek
---------------------------------------------------------
1 : 2, 4, 5, 2, 3, ...
2 : 3, 4, 5, 5, 1, 2, ...
3 : 1, 2, 1, 1, 2, 3, ...
4 : 3, 5, 1, 2, 4, ...
5 : 2, 2, 2, 3, 4, 5, 4, 2, 3, 1, 1, 2, 3, 5, 5, ...

The ANOVA setup would let the concepts be your treatments or factor groups. Your response variables have the same range and could be thought of as continuous, so you could just use a chi-square to check the distribution. For instance, here is the histogram for concept five as I have listed it so far:

  | o |   |   |
  | o |   |   |
  | o | o |   | o
o | o | o | o | o
o | o | o | o | o
-----------------
1 | 2 | 3 | 4 | 5

The o's mark the tallies. You may not be able to feel good about calling that normal, for reasons that will not be apparent unless you have studied statistics. These are among the reasons why chi-square sets itself apart in this case; many textbooks would also recommend it. But if you see that you are getting something that looks normal, or you get a very large number of responses and the distributions are not too skewed, the CLT will work.
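Tallying the concept-5 responses listed in the table above reproduces that histogram exactly:

```python
# Tally the concept-5 responses from the sample data above and print
# a text histogram (one 'o' per response).
from collections import Counter

responses = [2, 2, 2, 3, 4, 5, 4, 2, 3, 1, 1, 2, 3, 5, 5]
counts = Counter(responses)
for value in range(1, 6):
    print(value, "o" * counts[value])
```

The tallies come out as 2, 5, 3, 2, 3 for ratings 1 through 5, matching the column heights drawn above.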

The regression model that is equivalent to ANOVA would be

Response = b[0] + b[1]C[1]+b[2]C[2]+b[3]C[3]+b[4]C[4]+b[5]C[5] + e

where e is the error term and C[1] is the first concept. The data has to be coded here so that a response on the first concept of 4 becomes the data vector [4:1,0,0,0,0]. Those with regression training know what I mean. So the data sub-matrix for concept four and my sample data above would be

[3:0,0,0,1,0]
[5:0,0,0,1,0]
[1:0,0,0,1,0]
[2:0,0,0,1,0]
[4:0,0,0,1,0]

The null hypothesis for ANOVA is

H[0]: b[1] = b[2] = b[3] = b[4] = b[5] = 0

This implies that

Response = b[0] + e.

Essentially the null hypothesis means that there is no statistical influence from any of the five categories at all. The response you observed is simply normally distributed about b[0], the mean of all responses over all concepts (the grand mean).

F is the value that will tell you whether you can reject the null hypothesis [thanks EnumaElish; R^2 comes in after a significant F]. Doing ANOVA this way allows you to use the regression analysis to do tests of significance for each coefficient b[k], i.e. to test H[0]: b[k] = 0 against H[a]: b[k] ≠ 0. SAS can do all of this with just the use of a few options in the PROC GLM or PROC REG procedures.
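Outside SAS, the overall F test for this one-way setup can be sketched with SciPy; the ratings here are simulated, since the thread's real data is not available:

```python
# One-way ANOVA treating the five concepts as groups. Each array holds one
# concept's ratings; here they are simulated 1-5 responses for illustration.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(1)
groups = [rng.integers(1, 6, size=200) for _ in range(5)]   # hypothetical data

F, p = f_oneway(*groups)
print(f"F = {F:.2f}, p = {p:.4f}")
# A significant F rejects H0: all group means are equal
# (the b[1] = ... = b[5] = 0 hypothesis in the regression form above).
```

One caveat on the regression form as written: an intercept plus all five concept dummies is perfectly collinear, so in practice one dummy (or the intercept) is dropped before fitting.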

So I guess I do partly agree with Reilly. You might be able to deal with it using ANOVA, linear regression, or chi-square.

As a follow-up to Playdo: you can also reject the null if the model F statistic is significant at the 10% level (less stringent) or the 5% level (more stringent). The output from PROC REG in SAS will display this F value and/or its significance ("p") level, along with the R^2 measure for the regression. I guess it cannot hurt to use both the chi-square and the regression (ANOVA) analyses; if the two results agree, you can be twice as confident.


## What is statistical significance?

Statistical significance refers to the likelihood that the results of a study or experiment are not due to chance. It measures how strong the evidence is for a relationship between variables and is often used to judge whether the results of a study can be generalized to a larger population.

## How is statistical significance calculated?

Statistical significance is typically calculated using a p-value, which represents the probability of obtaining results as extreme or more extreme than the observed results if there was no true relationship between variables. Generally, a p-value of 0.05 or lower is considered statistically significant.
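As a worked example of the threshold described above: a two-sided test with an observed z statistic of 2.1 (a hypothetical value) gives a p-value below 0.05, so it would be called statistically significant at the 5% level.

```python
# Two-sided p-value for a hypothetical z statistic of 2.1.
from scipy.stats import norm

z = 2.1
p = 2 * norm.sf(abs(z))   # probability of a result at least this extreme
print(round(p, 3))        # about 0.036, below the 0.05 threshold
```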

## What is the difference between statistical significance and practical significance?

Statistical significance measures how strong the evidence is that a relationship between variables exists, while practical significance refers to the real-world importance or relevance of that relationship. A result may be statistically significant but not practically significant if the effect size is very small.

## Why is statistical significance important in research?

Statistical significance is important in research because it helps to determine if the results of a study are reliable and can be generalized to a larger population. It also allows researchers to make conclusions about cause and effect relationships between variables.

## What are some limitations of statistical significance?

While statistical significance is an important measure in research, it is not the only factor to consider. Other limitations include the possibility of Type I and Type II errors, the use of p-values as a cutoff for significance, and the need to interpret results in the context of the research question and study design.
