Is this a valid method for selecting a Simple Random Sample?

  • Context: MHB
  • Thread starter: Ackbach
  • Tags: Random

Discussion Overview

The discussion revolves around the validity of a method for selecting a Simple Random Sample (SRS) of apartment complexes based on a quiz question involving the use of a table of random digits. Participants analyze the sampling process described, questioning whether it adheres to the principles of random sampling and whether certain samples may be favored over others.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Mathematical reasoning

Main Points Raised

  • One participant questions whether the student's method inadvertently introduces bias, suggesting that certain samples may be less likely than others due to the arithmetic involved.
  • Another participant argues that the method does not constitute simple random sampling because repeating the sampling process on the same data yields the same result, thus violating the principle that each element should have an equal probability of selection.
  • Some participants propose that using different rows of the random number table could mitigate the issue, but they remain concerned about the arithmetic process used by the student.
  • A participant suggests testing the method by coding the algorithm and analyzing the results to identify any patterns that may indicate bias.
  • One participant shares their experience of simulating the sampling method using a spreadsheet, reporting a non-uniform distribution in the results, which they interpret as evidence that the method does not yield a Simple Random Sample.

Areas of Agreement / Disagreement

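No participant defends the method as a valid SRS. One reply objects that the procedure is deterministic for a given row of digits; the thread starter counters that a different row would be used each time and asks instead whether the arithmetic itself biases the outcome. The thread ends with the thread starter's simulation, from which he concludes that the method does not produce a Simple Random Sample.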

Contextual Notes

Participants express concerns about the arithmetic steps taken in the sampling process, noting that certain operations may restrict the range of possible outcomes. There is also mention of the need for a uniform distribution in a valid sampling method, which is not observed in the simulated results.

Ackbach
So, I have a jokester (MHB user Cmoney) in my class (what teacher doesn't?), who decided to go all-out on a quiz question. The question reads as follows:

You are planning a report on apartment living in a college town. You decide to select three apartment complexes at random for in-depth interviews with residents.

(a) Explain how you would use a line of Table D to choose an SRS (Simple Random Sample) of 3 complexes from the list below. Explain your method clearly enough for a classmate to obtain your results.

(b) Use line 117 to select the sample. Show how you use each of the digits.

Now Table D is a table of random digits as follows:

\begin{array}{cllllllll}
{\bf Line} \\
116 &14459 &26056 &31424 &80371 &65103 &62253 &50490 &61181 \\
117 &38167 &98532 &62183 &70632 &23417 &26185 &41448 &75532 \\
118 &73190 &32533 &04470 &29669 &84407 &90785 &65956 &86382
\end{array}

The apartment complex listing has 33 names in it - that's all that's really important.

For part (a), my student's answer is as follows:

First, I would obtain the second digit of every group in lines 116-118 (4,6,1,0,5,2,0,1,8,8,2,0,3,6,1,5,3,2,4,9,4,0,5,6). Second, split them into pairs: (46,10,52,01,88,20,36,15,32,49,40,56). Third, out of 33 apartments, labeled 1-33, take the first pair and last and subtract, then take the next two and subtract and so forth until you get three. (10,30,12). Fourth, the ones that were chosen were: (and he gives the three apartment complexes).
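The first two steps of that answer are mechanical and can be checked with a short sketch. The names `TABLE_D`, `second_digits` and `pair_up` are made up for illustration. The subtraction step is left out on purpose, because the answer does not pin down exactly which pairs are combined.

```python
# Table D lines 116-118, copied from the post above.
TABLE_D = {
    116: "14459 26056 31424 80371 65103 62253 50490 61181",
    117: "38167 98532 62183 70632 23417 26185 41448 75532",
    118: "73190 32533 04470 29669 84407 90785 65956 86382",
}


def second_digits(lines):
    """Second digit of every five-digit group, in reading order."""
    return [int(group[1]) for line in lines for group in TABLE_D[line].split()]


def pair_up(digits):
    """Join consecutive digits into two-digit strings such as '46' or '01'."""
    return [f"{a}{b}" for a, b in zip(digits[0::2], digits[1::2])]


digits = second_digits([116, 117, 118])
pairs = pair_up(digits)
print(digits)
print(pairs)
```

Both printed lists agree with the ones the student wrote down, so any problem lies in the arithmetic that follows rather than in reading the table.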

My question: is this truly an SRS, or did he inadvertently introduce a process that makes certain samples less likely than others (for example, is some intermediate number restricted to be smaller than a certain amount)?

For part (b), my student's answer is as follows:

Line 117: (38,16,79,85,32,62,18,37,06,32,23,41,72,61,85,41,44,87,55,32).
Then add each one [edit: it looks as though he did it digit-wise]: (11,7,16,13,5,8,9,10,6,5,5,5,9,7,13,5,8,15,10,5).
Subtract with the one to the right: (4,3,3,1,1,0,2,8,7,5).
Add: (7,4,1,10,12).
Subtract: (3,9,12)
Add: (12,12)
Add: 24, which is a particular apartment.

He stops here, so he doesn't obtain the full sample of three complexes. I know some of these steps are suspect: the very first one has a maximum of 18. And is each of the possible samples equally likely?
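For anyone who wants to replay the part (b) arithmetic, here is a minimal sketch. It rests on two assumptions read off the student's numbers rather than stated by him: "subtract" means the absolute difference of neighbours, and a leftover value with no neighbour is carried to the next step unchanged. The helper names (`combine`, `student_method`) are made up.

```python
LINE_117 = "38167 98532 62183 70632 23417 26185 41448 75532"


def combine(values, op):
    """Apply op to neighbouring pairs; an unpaired last value is carried over."""
    out = [op(a, b) for a, b in zip(values[0::2], values[1::2])]
    if len(values) % 2:
        out.append(values[-1])
    return out


def add(a, b):
    return a + b


def diff(a, b):
    return abs(a - b)  # assumption: "subtract" means absolute difference


def student_method(row):
    """Return every intermediate list, ending with the single chosen number."""
    digits = row.replace(" ", "")
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    values = [int(p[0]) + int(p[1]) for p in pairs]  # digit-wise sums
    steps = [values]
    for op in (diff, add, diff, add, add):
        values = combine(values, op)
        steps.append(values)
    return steps


steps = student_method(LINE_117)
for step in steps:
    print(step)
```

Run on line 117, this reproduces each list in the answer and ends at 24. Under the same assumptions, the final number works out to the sum of two absolute differences and one sum, so it can be as low as 0 or as high as 108. Nothing in the arithmetic keeps it within the labels 1 to 33.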

Thanks!
 
I would not call this simple random sampling for the simple reason that if you repeat the sampling process on the same data you will get the exact same apartment each time. That would not occur with a proper random sample. If the goal is to take one element from a list of size N, then each element should have the probability 1/N of being selected. Using your student's method, once the list is made, one particular element has a 100% chance of being selected and the others have 0% no matter how many times the sampling is repeated.
 
Well, I think the idea is that if you did the sampling again, you'd use a different row of the table of random numbers. I'm not worried about the table of random numbers. If you like, imagine those numbers to have come from a pseudo-random-number-generator. I'm worried about all the arithmetic and (what I would call) shenanigans that my student is doing. Is the arithmetic he's using inadvertently making some samples less likely than others?
 
I think I get your question, but just want to point out that even if you won't get the same data each round, the fact that repeating the process on the same data always gives the same answer is bad. It's the complete opposite of random.

You are asking, I think, whether his process is inherently biased compared with a similar method that is not. I don't think this is a standard way of sampling. Nevertheless, one way to test it would be to look for a pattern, but I'm lazier than that: I would code the algorithm and run it on a huge number of 5-digit numbers to see what I get.

That's all I have to weigh in on. Maybe someone else can quickly spot a pattern.
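The brute-force check suggested above might look something like the sketch below. It assumes the part (b) procedure uses absolute differences and carries an unpaired value forward, and it feeds the procedure rows of 40 pseudo-random digits in place of rows from Table D. The function names, seed and trial count are arbitrary choices.

```python
import random
from collections import Counter


def student_pick(digits):
    """Part (b) arithmetic on a string of 40 digits (absolute differences assumed)."""
    s = [int(digits[i]) + int(digits[i + 1]) for i in range(0, 40, 2)]  # digit sums
    d = [abs(s[i] - s[i + 1]) for i in range(0, 20, 2)]                 # subtract
    a = [d[i] + d[i + 1] for i in range(0, 10, 2)]                      # add
    # subtract (a[4] is carried), add, add: collapses to a single expression
    return abs(a[0] - a[1]) + abs(a[2] - a[3]) + a[4]


def simulate(trials, seed=0):
    """Tally the number picked over many rows of pseudo-random digits."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        row = "".join(rng.choice("0123456789") for _ in range(40))
        counts[student_pick(row)] += 1
    return counts


counts = simulate(20_000)
in_range = sum(n for value, n in counts.items() if 1 <= value <= 33)
print("share of results in 1..33:", in_range / 20_000)
for value in sorted(counts):
    print(f"{value:3d} {'#' * (counts[value] // 50)}")
```

If every complex were equally likely, the text histogram would be roughly flat across 1 to 33 and empty elsewhere, so any hump or spill-over past 33 is evidence against the method.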
 
Wow, seems like you have a genius on your hands there, Ackbach.

Personally, I would create a spreadsheet and plot the results in a histogram. Try that out and see whether you get the results you are looking for.

Good luck
 
Well, I constructed a LibreOffice Calc spreadsheet to simulate this method of sampling. I did a histogram of the resulting numbers (over 200 of them), and there was a definite pattern. The five-number summary was {3, 14.5, 20, 24, 41}. The mean was 20.1, and the standard deviation was 7.7. The histogram was unimodal and symmetric, with a definite peak near 21. There were no outliers or gaps.

Perhaps the most important feature: the histogram was by no means flat, whereas a uniform distribution would produce a flat one. Therefore, I conclude that this sampling method would not produce a Simple Random Sample.
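For contrast, here is a sketch of the conventional way to use a line of random digits. Nobody in the thread spells it out, so treat it as an assumption about what the quiz expects: label the complexes 01 to 33, read the line two digits at a time, and skip any group that is outside 01 to 33 or already chosen. The function name and parameters are illustrative.

```python
LINE_117 = "38167 98532 62183 70632 23417 26185 41448 75532"


def srs_from_table(row, population=33, sample_size=3):
    """Read two-digit groups; keep labels 01..population, skipping repeats."""
    digits = row.replace(" ", "")
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        if 1 <= label <= population and label not in chosen:
            chosen.append(label)
            if len(chosen) == sample_size:
                break
    return chosen


print(srs_from_table(LINE_117))  # [16, 32, 18]
```

Each two-digit group is equally likely to be any of 00 to 99, so each label from 01 to 33 is equally likely to be the next one accepted. That is the property the histogram above fails to show for the student's arithmetic.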
 
