Hypothesis or claim about a mean or proportion

  • Thread starter: all4one
  • Tags: Mean
AI Thread Summary
To test the claim that the average price of a new college textbook is $102.44, data collection is essential, and a diverse sample of textbooks should be gathered. A sample of at least 50 to 100 textbooks is recommended, ensuring that selection criteria are random and not price-based to avoid bias. Prices can be sourced from various platforms, including local bookstores and online retailers like Amazon. It's important to consider lurking variables that may affect textbook prices, such as publication date and the distribution of textbook types, to ensure accurate comparisons. Ultimately, a well-structured sample will allow for a reliable assessment of the claim regarding textbook pricing.
all4one
I have a project for my statistics class.

This is the claim: "The website www.campusbooks.com claims that the average price of a new college textbook is $102.44."

Where do I begin to test this claim? I have the full project description too, if you need more information to help me.
 
Let's see... To test any hypothesis, you'd need some data. So first and foremost there is a data collection issue.
 
EnumaElish said:
Let's see... To test any hypothesis, you'd need some data. So first and foremost there is a data collection issue.

My bad, the website is campusbooks.com not campusbook.com

What I was thinking was to use the null (H0) and alternative (H1) hypotheses.

But what should my approach be to get the data? Should I take a sample of 50 textbooks and compute their mean price? That way it might not be very accurate. Or should I take a sample of 50 textbooks of one particular kind (only math books, say)?
 
You should collect a sample of all kinds of textbooks that the website claims to have averaged. If the website said "the average price of Math books" then you should restrict your sample to Math books, but otherwise you shouldn't.
 
EnumaElish said:
You should collect a sample of all kinds of textbooks that the website claims to have averaged. If the website said "the average price of Math books" then you should restrict your sample to Math books, but otherwise you shouldn't.

So basically I can collect prices from all kinds of textbooks. But they don't necessarily have to come from the website; I can get all the prices from my school bookstore (the point being to see whether the claim is false or not).

But the problem is that some new textbooks are cheaper than others. How big should my sample be? 50 books? 100 books? Say my 50 sampled books are all low-priced; then obviously the mean is going to be lower, or vice versa.
 
Yes, you can get prices from your school bookstore. That would be the "local approach."

A more general approach could be, say, to go to the Amazon.com website (or the barnesandnoble.com website) and look for textbook prices there. Since there are going to be a lot of textbooks, you should think about a "rule" as to which books to look for; but it shouldn't be a price-based rule. For example, "I will make a list of the 100 lowest-priced textbooks that I can find on Amazon.com" cannot be the rule because then your sample prices will have a downward bias.

A good rule should sample as many subject areas as possible and make a selection using random criteria. Examples of random criteria are:
1. include a textbook if its ISBN is divisible by 3 (that is, if the sum of its digits is divisible by 3, which is the same thing)
2. include a textbook if the last name of any of its authors begins with the letter M.
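
As a rough sketch of rule 1, the filter below keeps a book only when the digit sum of its ISBN is divisible by 3. The function names, the ISBNs, and the prices are all made up for illustration, and the sketch assumes purely numeric ISBN-13s (an ISBN-10 ending in the check character X would need separate handling).

```python
def isbn_digits(isbn: str) -> list:
    """Return the numeric digits of an ISBN, ignoring hyphens and spaces."""
    return [int(ch) for ch in isbn if ch.isdigit()]


def selected_by_rule(isbn: str) -> bool:
    """Rule 1: keep the book if the digit sum of its ISBN is divisible by 3.

    For a decimal number this is equivalent to the number itself being
    divisible by 3.
    """
    return sum(isbn_digits(isbn)) % 3 == 0


# Hypothetical catalogue of (ISBN, price) pairs. These are placeholders,
# not real books, and the check digits are not meant to be valid.
catalogue = [
    ("978-0-00-000001-1", 89.95),
    ("978-0-00-000002-4", 120.00),
    ("978-0-00-000003-0", 64.50),
    ("978-0-00-000004-7", 149.99),
    ("978-0-00-000005-3", 101.25),
    ("978-0-00-000006-9", 77.00),
]

# The rule never looks at the price, so it cannot favour cheap or expensive books.
sample = [(isbn, price) for isbn, price in catalogue if selected_by_rule(isbn)]
print(len(sample), "of", len(catalogue), "books selected")
```

Over a large catalogue a rule like this should keep roughly one book in three, regardless of price.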
 
all4one said:
But the problem is that some new textbooks are cheaper than others. How big should my sample be? 50 books? 100 books? Say my 50 sampled books are all low-priced; then obviously the mean is going to be lower, or vice versa.

50 is good, 100 is even better. You should be okay as long as your selection criteria do not depend on price.

Suppose you sample your local bookstore and the average you calculate is lower than the price posted on the website. If the difference between the two numbers is large, then you can state with some confidence that the price of the "average textbook" in your local bookstore is lower than the price posted on the website.

But if the difference is small, then you cannot be so confident that your bookstore's average is truly different from the price posted on the website. It may be identical, or just a little different.
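
One way to make "large" and "small" concrete is a one-sample test of H0: mean = 102.44 against H1: mean ≠ 102.44. The sketch below is only an illustration: the function name is hypothetical, the prices are simulated rather than collected, and the p-value uses a normal approximation to the t distribution, which is reasonable for samples of about 50 or more.

```python
import math
import random
import statistics

CLAIMED_MEAN = 102.44  # the average price claimed by the website


def one_sample_test(prices, claimed_mean=CLAIMED_MEAN):
    """Return (sample mean, test statistic, approximate two-sided p-value)."""
    n = len(prices)
    mean = statistics.mean(prices)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    std_err = statistics.stdev(prices) / math.sqrt(n)
    t_stat = (mean - claimed_mean) / std_err
    # Normal approximation to the t distribution (an assumption of this sketch).
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(t_stat)))
    return mean, t_stat, p_value


# Simulated stand-in for 50 collected prices; replace with real data.
random.seed(0)
prices = [random.gauss(95, 30) for _ in range(50)]

mean, t_stat, p_value = one_sample_test(prices)
print(f"sample mean = {mean:.2f}, t = {t_stat:.2f}, p = {p_value:.3f}")
```

A small p-value (below 0.05, say) corresponds to the "large difference" case above; a large p-value means the sample is consistent with the posted price, which is not the same as proving the two are equal.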
 
Last edited:
Your advice is very helpful.
One last question, about "lurking variables."

By definition, a lurking variable is one that affects the variables being studied but is not included in the study.

How should I attempt to control for lurking variables in the data collection process?
 
Your test variable is the price of a textbook. Your test methodology is to collect a sample and then compare the sample average against the posted price. Your sample average may be biased if there are variables that affect the construction of the sample which also happen to affect the prices in the sample in a systematic way. I will give you two examples; you can and should find more examples and try to take these into account. Even if you cannot account for them, you should still mention them in your report.

Example: Suppose that the average textbook price increases with time. If the price on the website is "dated" relative to your sample then your sample will have an upward bias.

Example: Suppose that graduate texts are more expensive than undergrad texts, but for every graduate textbook published or sold there are five undergrad books published or sold. If your sample consists of grad and undergrad texts in equal numbers, then your sample will have an upward bias relative to the average textbook published or sold. You can generalize this example by thinking about potential price differentials between different groups of textbooks, e.g. by subject area (say, medical vs. technical), rather than grad vs. undergrad, which is just one example.
 
To control against the first example, you can try to include in your sample only the books that were published before the date at which the posted price was calculated -- if you know when it was calculated, even approximately.

To control against the 2nd example, you can weight each price by the number of grad vs. undergrad (or med vs tech) students in your school, your county, or your state. If the ratio of med students to tech students is 2 to 1, then each medical textbook in your sample should be counted twice. You can also use the actual number of each textbook sold as a weight, if you can find those sale numbers.
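
The weighting idea can be sketched as follows. The group labels, prices, and the 2-to-1 weights are made up for illustration, and `weighted_mean` is a hypothetical helper name, not part of any library.

```python
def weighted_mean(prices, weights):
    """Average of prices where each price counts in proportion to its weight."""
    total_weight = sum(weights)
    return sum(p * w for p, w in zip(prices, weights)) / total_weight


# Hypothetical sample with equal numbers of medical and technical books.
sample = [
    ("medical", 180.0),
    ("medical", 150.0),
    ("technical", 90.0),
    ("technical", 110.0),
]

# Assumed ratio of med students to tech students: 2 to 1,
# so each medical textbook is counted twice.
group_weight = {"medical": 2, "technical": 1}

prices = [price for _, price in sample]
weights = [group_weight[group] for group, _ in sample]

unweighted = sum(prices) / len(prices)
weighted = weighted_mean(prices, weights)
print(f"unweighted = {unweighted:.2f}, weighted = {weighted:.2f}")
```

If sales figures are available, the same function works with the number of copies sold as the weight for each book.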

Controlling for these differences can get out of hand fast. My advice is to create a sample and calculate its unweighted average; you can worry about weighting at a later stage.
 
