MHB Is this a valid method for selecting a Simple Random Sample?

  • Thread starter: Ackbach
  • Tags: Random
AI Thread Summary
The discussion centers on the validity of a student's method for selecting a Simple Random Sample (SRS) of apartment complexes using a table of random digits. The student’s approach introduces arithmetic steps that may bias the selection process, resulting in certain samples being less likely than others. It is argued that a proper SRS should ensure each element has an equal probability of selection, which the student's method fails to achieve. A simulation using a spreadsheet revealed a non-uniform distribution in the results, further confirming the method's inadequacy for true random sampling. Ultimately, the consensus is that the student's method does not constitute a valid SRS.
Ackbach
So, I have a jokester (MHB user Cmoney) in my class (what teacher doesn't?), who decided to go all-out on a quiz question. The question reads as follows:

You are planning a report on apartment living in a college town. You decide to select three apartment complexes at random for in-depth interviews with residents.

(a) Explain how you would use a line of Table D to choose an SRS (Simple Random Sample) of 3 complexes from the list below. Explain your method clearly enough for a classmate to obtain your results.

(b) Use line 117 to select the sample. Show how you use each of the digits.

Now Table D is a table of random digits as follows:

\begin{array}{cllllllll}
{\bf Line} \\
116 &14459 &26056 &31424 &80371 &65103 &62253 &50490 &61181 \\
117 &38167 &98532 &62183 &70632 &23417 &26185 &41448 &75532 \\
118 &73190 &32533 &04470 &29669 &84407 &90785 &65956 &86382
\end{array}

The apartment complex listing has 33 names in it - that's all that's really important.

For part (a), my student's answer is as follows:

First, I would obtain the second digit of every group in lines 116-118 (4,6,1,0,5,2,0,1,8,8,2,0,3,6,1,5,3,2,4,9,4,0,5,6). Second, split them into pairs: (46,10,52,01,88,20,36,15,32,49,40,56). Third, out of 33 apartments, labeled 1-33, take the first pair and last and subtract, then take the next two and subtract and so forth until you get three. (10,30,12). Fourth, the ones that were chosen were: (and he gives the three apartment complexes).

My question: is this truly an SRS, or did he inadvertently introduce a process that makes certain samples less likely than others (for example, is some intermediate number restricted to be smaller than a certain amount)?
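One piece of this can be checked by direct enumeration. The sketch below assumes that "subtract" means taking the absolute difference of two independent two-digit numbers, each uniform on 00-99 (the student's exact pairing rule is not fully clear from his answer, so this is an illustration, not a reconstruction). Under that assumption, small differences are far more likely than large ones, so the resulting labels cannot be equally likely.

```python
from collections import Counter
from fractions import Fraction

# Enumerate all 100 x 100 equally likely pairs of two-digit numbers
# and tally the absolute difference of each pair.
diff_counts = Counter(abs(x - y) for x in range(100) for y in range(100))

# Exact probability of each difference.
diff_prob = {d: Fraction(c, 10_000) for d, c in diff_counts.items()}

# How often does the difference land on a usable label (1..33)?
usable = sum(diff_counts[d] for d in range(1, 34))

for d in (0, 1, 10, 33, 99):
    print(f"P(|X - Y| = {d:2d}) = {float(diff_prob[d]):.4f}")
print(f"P(1 <= |X - Y| <= 33) = {usable / 10_000:.4f}")
```

A difference of 1 arises from 198 of the 10,000 pairs, while a difference of 33 arises from only 134, so complex 1 would be picked noticeably more often than complex 33.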

For part (b), my student's answer is as follows:

Line 117: (38,16,79,85,32,62,18,37,06,32,23,41,72,61,85,41,44,87,55,32).
Then add each one [edit: it looks as though he did it digit-wise]: (11,7,16,13,5,8,9,10,6,5,5,5,9,7,13,5,8,15,10,5).
Subtract with the one to the right: (4,3,3,1,1,0,2,8,7,5).
Add: (7,4,1,10,12).
Subtract: (3,9,12)
Add: (12,12)
Add: 24, which is a particular apartment.

He stops here, so he never obtains the full sample of three complexes. I know some of these steps are suspect: the very first one, a digit sum, has a maximum of 18. And is each of the possible samples equally likely?
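The arithmetic for part (b) can be replayed in a few lines. This is a sketch of my reading of the steps: "subtract" is taken as the absolute difference of adjacent values, and an unpaired last value is carried forward unchanged, which is what the lists above imply. The function names `pairwise` and `student_method` are made up for this illustration.

```python
from collections import Counter

def pairwise(values, op):
    """Combine adjacent pairs with op; carry an unpaired last value forward."""
    out = [op(values[i], values[i + 1]) for i in range(0, len(values) - 1, 2)]
    if len(values) % 2:
        out.append(values[-1])
    return out

def student_method(digits):
    """Apply the student's part (b) steps to a string of 40 digits."""
    # Step 1: split into two-digit pairs and add the two digits of each pair.
    values = [int(digits[i]) + int(digits[i + 1]) for i in range(0, len(digits), 2)]
    add = lambda a, b: a + b
    sub = lambda a, b: abs(a - b)
    # Steps 2-6: subtract, add, subtract, add, add.
    for op in (sub, add, sub, add, add):
        values = pairwise(values, op)
    return values[0]

LINE_117 = "38167 98532 62183 70632 23417 26185 41448 75532".replace(" ", "")
print(student_method(LINE_117))  # prints 24, matching the student's result

# The first step alone is already non-uniform: tally the digit sums of 00..99.
digit_sums = Counter(a + b for a in range(10) for b in range(10))
print(digit_sums[0], digit_sums[9], digit_sums[18])  # prints 1 10 1
```

Of the 100 equally likely two-digit pairs, ten have digit sum 9 but only one has digit sum 0 and only one has digit sum 18, so the values fed into the later steps are unevenly weighted from the start.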

Thanks!
 
I would not call this simple random sampling for the simple reason that if you repeat the sampling process on the same data you will get the exact same apartment each time. That would not occur with a proper random sample. If the goal is to take one element from a list of size N, then each element should have the probability 1/N of being selected. Using your student's method, once the list is made, one particular element has a 100% chance of being selected and the others have 0% no matter how many times the sampling is repeated.
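For comparison, here is a sketch of the usual textbook way to draw an SRS from a table of random digits. It is not something proposed in this thread, and the function name `srs_from_digits` is made up for the example. Label the complexes 01 to 33, read the line two digits at a time, and skip any group that is 00, greater than 33, or a repeat of a label already chosen.

```python
def srs_from_digits(digits, population=33, sample_size=3):
    """Pick distinct labels 1..population by reading two-digit groups in order."""
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        # Skip labels outside 01..population and labels already chosen.
        if 1 <= label <= population and label not in chosen:
            chosen.append(label)
            if len(chosen) == sample_size:
                break
    return chosen

LINE_117 = "38167 98532 62183 70632 23417 26185 41448 75532".replace(" ", "")
print(srs_from_digits(LINE_117))  # prints [16, 32, 18]
```

Because every two-digit group from 00 to 99 is equally likely and no arithmetic is applied to the groups, every label from 01 to 33 has the same chance at each draw, which is the property the 1/N argument above asks for.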
 
Well, I think the idea is that if you did the sampling again, you'd use a different row of the table of random numbers. I'm not worried about the table of random numbers. If you like, imagine those numbers to have come from a pseudo-random-number-generator. I'm worried about all the arithmetic and (what I would call) shenanigans that my student is doing. Is the arithmetic he's using inadvertently making some samples less likely than others?
 
I think I get your question, but just want to point out that even if you won't get the same data each round, the fact that repeating the process on the same data always gives the same answer is bad. It's the complete opposite of random.

You are asking, I think, whether his process is inherently biased compared with some similar method that would not be. This is not a standard way of sampling, but one way to test it would be to look for a pattern in the arithmetic. I'm lazier than that: I would code the algorithm, run it on a huge number of 5-digit groups, and see what comes out.

That's all I have to weigh in on. Maybe someone else can quickly spot a pattern.
 
Wow, seems like you have a genius on your hands there, Ackbach.

I personally would create a spreadsheet to run the method many times, then plot the results as a histogram. Try that out, and see if you get the results that you are looking for.

Good luck
 
Well, I constructed a LibreOffice Calc spreadsheet to simulate this method of sampling. I did a histogram of the resulting numbers (over 200 of them), and there was a definite pattern. The five-number summary was {3, 14.5, 20, 24, 41}. The mean was 20.1, and the standard deviation was 7.7. The histogram was unimodal and symmetric, with a definite peak near 21. There were no outliers or gaps.

Perhaps the most important feature is the one that is missing: a uniform distribution would give a roughly flat histogram, and this one was by no means flat. Therefore, I conclude that this sampling method would not produce a Simple Random Sample.
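The same experiment can be sketched in Python instead of a spreadsheet. This is a rough re-creation, not the original LibreOffice sheet: it assumes the part (b) steps are digit sums followed by subtract, add, subtract, add, add on adjacent values (with "subtract" meaning absolute difference), and it uses Python's `random` module in place of Table D. The trial count and seed are arbitrary choices.

```python
import random
from collections import Counter

def pairwise(values, op):
    """Combine adjacent pairs with op; carry an unpaired last value forward."""
    out = [op(values[i], values[i + 1]) for i in range(0, len(values) - 1, 2)]
    if len(values) % 2:
        out.append(values[-1])
    return out

def student_method(digits):
    """Apply the assumed part (b) steps to a list of 40 integer digits."""
    values = [digits[i] + digits[i + 1] for i in range(0, len(digits), 2)]
    add = lambda a, b: a + b
    sub = lambda a, b: abs(a - b)
    for op in (sub, add, sub, add, add):
        values = pairwise(values, op)
    return values[0]

def simulate(trials=20_000, seed=0):
    """Run the method on many simulated 40-digit lines and tally the results."""
    rng = random.Random(seed)
    return Counter(
        student_method([rng.randrange(10) for _ in range(40)])
        for _ in range(trials)
    )

counts = simulate()
trials = sum(counts.values())
mean = sum(value * n for value, n in counts.items()) / trials
print(f"mean = {mean:.1f}")

# Crude text histogram: one '#' per 0.5% of trials.
for value in range(0, max(counts) + 1, 3):
    bar = "#" * round(200 * counts[value] / trials)
    print(f"{value:2d} {bar}")
```

In runs of this sketch the results pile up around 20 and thin out toward both ends, and some of them exceed 33, which does not correspond to any complex on the list. A flat histogram over 1 to 33 is what an SRS of a single complex would require.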
 