Testing whether one mean is higher than another

Tom McCurdy · Apr 30, 2011

A problem discussed in class involved a data set with the quantity sold in one column and the promotion level in another. The promotion level took values between 0 and 0.88, with a number of the values equal to zero.

The problem was to test the following hypothesis:
The average quantity sold by Bob when there is a promotion (promotion level p > 0) is significantly higher than when there is no promotion.

My professor claimed the problem should be solved using regression, with the promotion value as the independent variable and the quantity sold as the dependent variable. His plan was then to use the regression output in Excel to test whether or not the slope was equal to zero.

The biggest mistake I see right away is that he would need to split the data into two discrete groups: 1) promotion and 2) no promotion. The second problem I see is what arbitrary value you would then assign to group 1), the promotion group.

It has been a while since I took a statistics class, but from what I remember, a hypothesis test for two sample means would be set up as
[tex]H_0 : \mu_{\text{promo}} = \mu_{\text{no-promo}}[/tex]
[tex]H_1 : \mu_{\text{promo}} > \mu_{\text{no-promo}}[/tex]

The sample sizes are different:
the promo category had 201 observations
the non-promo category had 17 observations

Next you would need to decide whether the population variances can be considered equal.
If they are equal, you test the means one way; if they are not, you have to test the means another way, which as I recall was a substantial amount of work.
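
To make the two cases concrete, here is a minimal sketch of both test statistics using only the Python standard library. The function names `pooled_t` and `welch_t` and the two small data lists are made up for illustration; they are not the class data (which had 201 and 17 observations). The equal-variance case is the usual pooled two-sample t-test, and the unequal-variance case is Welch's t-test with the Welch-Satterthwaite degrees of freedom.

```python
import math
from statistics import mean, variance  # variance() is the sample variance (n - 1)

def pooled_t(a, b):
    """Equal-variance t statistic and degrees of freedom for mean(a) - mean(b)."""
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2

def welch_t(a, b):
    """Unequal-variance (Welch) t statistic and approximate degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation; generally not an integer
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical quantities sold, NOT the data from class
promo = [12, 15, 14, 18, 16, 13, 17, 15]
no_promo = [10, 11, 9, 12]

print("pooled:", pooled_t(promo, no_promo))
print("welch: ", welch_t(promo, no_promo))
```

For the one-sided alternative above, the statistic is compared with the upper-tail critical value of the t distribution at the returned degrees of freedom. If SciPy is available, I believe `scipy.stats.ttest_ind(promo, no_promo, equal_var=False, alternative="greater")` returns the Welch statistic together with the one-sided p-value, which removes most of the manual work.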

Is my professor right? Can you bypass all of this by simply putting the numbers into two categories, running a linear regression, and checking the p-value for the slope?
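
As a quick check of the two-category idea, the sketch below codes the groups as a 0/1 indicator (1 = promotion, 0 = no promotion, which is one answer to the "arbitrary value" question) and fits an ordinary least-squares line. The helper names `slope_t` and `pooled_t` and the data are hypothetical, chosen only for illustration. With a 0/1 regressor the fitted slope is the difference between the two group means, and the t statistic for the slope matches the equal-variance (pooled) two-sample t statistic.

```python
import math
from statistics import mean, variance

def slope_t(x, y):
    """Fit y = b0 + b1*x by least squares; return (b1, t for H0: b1 = 0, residual df)."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    # Residual sum of squares and the standard error of the slope
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    return b1, b1 / se_b1, n - 2

def pooled_t(a, b):
    """Equal-variance two-sample t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb)), na + nb - 2

# Hypothetical quantities sold, NOT the data from class
promo = [12, 15, 14, 18, 16, 13, 17, 15]
no_promo = [10, 11, 9, 12]

# Indicator coding: 1 for promotion rows, 0 for no-promotion rows
x = [1] * len(promo) + [0] * len(no_promo)
y = promo + no_promo

print("regression slope, t, df:", slope_t(x, y))
print("pooled t, df:           ", pooled_t(promo, no_promo))
```

So the regression on two categories does not bypass the equal-variance question; it quietly assumes the variances are equal. Also note that the p-value reported for a regression slope is normally two-sided, so for the one-sided alternative above it would be halved when the slope is positive.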

Mark44 · Apr 12, 2019

For the professor's approach, which is essentially a graphics approach, but with Excel doing the calculations, it's not clear to me how you would plot the points representing the different samples. For the 17 samples where there was no promotion, you would have points scattered along the vertical axis.

If by "promotion" you mean something like "10% price drop" and "20% price drop," the graph would have many points scattered along vertical lines at the locations on the horizontal axis for the various price drops. I suppose you could find the line of best fit (i.e., use a regression line), but you'd want to take the calculated slope of the line with a grain of salt if the correlation coefficient R was close to zero, since that would mean the line fits the data poorly.
