Survey with dynamic question selection

  • Context: Undergrad 
  • Thread starter: GiTS
  • Tags: Dynamic Survey

Discussion Overview

The discussion revolves around designing a survey that dynamically selects questions based on respondents' preferences for notebook designs. Participants explore methods to calculate the probability of respondents liking specific notebook designs based on previous answers and overall trends from earlier respondents.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning

Main Points Raised

  • One participant proposes using a probability model to determine how likely a respondent is to like a notebook design based on their answers and the responses of others.
  • Another participant suggests that correlation regression may be necessary to analyze the data effectively, indicating a need to recall the relevant equations.
  • A participant simplifies the problem by using a smaller set of questions about food preferences and presents statistical results from 100 completed surveys to illustrate how to adjust probabilities for subsequent respondents.
  • Further adjustments to the probabilities are discussed, showing how the likelihood of liking certain items changes based on previous answers.
  • One participant mentions consulting a statistics professor and indicates a preference for using ANOVA or regression analysis for their survey design, while expressing concerns about using PHP for hosting.

Areas of Agreement / Disagreement

Participants express various approaches to the problem, with no consensus on a single method or model for calculating probabilities. Multiple competing views on statistical techniques remain present throughout the discussion.

Contextual Notes

Participants mention the need for specific statistical methods, such as least squares and regression analysis, but do not resolve the mathematical steps or assumptions underlying their approaches.

GiTS
I am designing a survey to gather data on consumer preferences for notebooks (school club charity project). The survey will contain only 10 questions about which notebook designs respondents prefer. Thus, I only want each respondent to see notebooks they are likely to want.

I am not sure how to set up the equation. I want to find the probability a person will like a notebook based on their responses and the responses of others. Sort of like how Amazon has “people who bought this product also liked”.

So if I have 40 different designs of notebooks, I want to give each design a chance to appear on the survey, but weighted by the probability the person will like it.

The notebooks the very first respondent sees are completely random. But the next respondent sees what they are likely to like, based on the first person.
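A minimal sketch of the weighted selection described above, assuming each design already has an estimated like-probability. The names `pick_designs`, `like_prob` and `FLOOR` are made up for illustration, and the small floor on the weights is an assumption added so that every design keeps some chance of appearing.

```python
import random

def pick_designs(weights, k, rng=random):
    """Pick k distinct design ids; each draw is weighted by the design's weight."""
    remaining = dict(weights)          # copy so the caller's dict is untouched
    chosen = []
    for _ in range(min(k, len(remaining))):
        ids = list(remaining)
        pick = rng.choices(ids, weights=[remaining[i] for i in ids], k=1)[0]
        chosen.append(pick)
        del remaining[pick]            # a design is never shown twice
    return chosen

# Hypothetical like-probabilities for the 40 designs. Before anyone has
# answered, all designs are equally weighted, so the first survey is random.
FLOOR = 0.05                           # assumed minimum weight per design
like_prob = {f"design_{i}": 0.5 for i in range(40)}
weights = {d: max(p, FLOOR) for d, p in like_prob.items()}

survey = pick_designs(weights, k=10)   # the 10 designs shown to one respondent
```

As responses come in, only `like_prob` needs updating; the selection step stays the same.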

I am trying to simplify the problem to make it easier for me. Alice and Bob take surveys. Alice goes first.

Q1: Do you like Tacos? Y/N
Q2: Do you like Pizza? Y/N
Q3: Do you like Spaghetti? Y/N

Alice answers Y,N,Y

What’s the probability Bob will give a specific answer for:
Q1, Q2, Q3
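With a single prior respondent, the raw frequencies would put Bob at 100% or 0% on every question. One common workaround, which is not something proposed in this thread, is add-one (Laplace) smoothing; the sketch below assumes that technique, and the function name is hypothetical.

```python
def smoothed_yes_probability(yes_count, n_respondents):
    """Add-one (Laplace) estimate of P(yes) after n_respondents answers."""
    return (yes_count + 1) / (n_respondents + 2)

# Alice's answers to Q1..Q3, taken from the example above.
alice = {"Q1": "Y", "Q2": "N", "Q3": "Y"}

# Estimated probability that Bob answers "Y" to each question.
bob_estimate = {
    q: smoothed_yes_probability(1 if a == "Y" else 0, n_respondents=1)
    for q, a in alice.items()
}
# Q1 and Q3 come out at 2/3, Q2 at 1/3.
```

The estimates move toward the observed frequencies as more respondents are added.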

I figure if I can find the basic equation I can turn it into 40 different questions and hundreds of respondents.

Any thoughts?
 
OK, after some memory-jogging research, I think I need to use correlation regression. Now I just have to remember/relearn the equation for it and how to solve it.

I can probably find Σ with some form of counting loop like
$n = number of respondents
for $n ($x1 +...
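As a rough illustration of the counting-loop idea, the sketch below accumulates the sums that the Pearson correlation formula needs for two yes/no questions coded as 1/0. It is written in Python rather than the PHP-style pseudocode above, the answer lists are made-up data, and it is only one possible reading of what the loop was meant to compute.

```python
import math

def pearson_from_sums(xs, ys):
    """Pearson correlation computed from running sums over paired answers."""
    n = len(xs)
    sum_x = sum_y = sum_xy = sum_xx = sum_yy = 0
    for x, y in zip(xs, ys):           # one pass over the respondents
        sum_x += x
        sum_y += y
        sum_xy += x * y
        sum_xx += x * x
        sum_yy += y * y
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    if denominator == 0:               # everyone gave the same answer
        return None
    return numerator / denominator

# Hypothetical answers from five respondents (1 = yes, 0 = no).
tacos = [1, 1, 0, 1, 0]
burritos = [1, 1, 0, 0, 0]
r = pearson_from_sums(tacos, burritos)
```

A value of `r` near 1 would suggest that liking one item goes along with liking the other.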
 
Ok, so I've simplified the problem down and changed some things to make it easier to think about.

Three possible questions:
Do you like Tacos? y/n
Do you like Burritos? y/n
Do you like Soup? y/n

After 100 completed surveys the results are
a People who liked tacos, burritos and soup: 30%
b People who liked tacos, burritos only: 35%
c People who liked tacos, soup only: 4%
d People who liked burritos and soup only: 3%
e People who liked tacos only: 7%
f People who liked burritos only: 6%
g People who liked soup only: 15%

The 101st respondent is asked "do you like tacos?"
The probability they will like tacos is 7% + 4% + 30% + 35% = 76%.
They respond "yes".
Then they are asked: "Do you like burritos?"
The probabilities must be adjusted because there are now fewer possibilities: d, f and g must be removed.
So
a People who liked tacos, burritos and soup: 30%
b People who liked tacos, burritos only: 35%
c People who liked tacos, soup only: 4%
e People who liked tacos only: 7%

These only add up to 76%, so we rescale the percentages above to total 100 by dividing each one by 0.76. So of the people who liked tacos...
a People who liked tacos, burritos and soup: 39.5%
b People who liked tacos, burritos only: 46.1%
c People who liked tacos, soup only: 5.3%
e People who liked tacos only: 9.2%
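The removal and rescaling steps can be sketched in a few lines of Python. The table below holds the seven percentages from the 100 completed surveys; the tuple encoding and the function names `probability_yes` and `condition` are assumptions made for this example.

```python
# Survey results keyed by (likes_tacos, likes_burritos, likes_soup);
# values are the percentages from the 100 completed surveys.
joint = {
    (True, True, True): 30,     # a
    (True, True, False): 35,    # b
    (True, False, True): 4,     # c
    (False, True, True): 3,     # d
    (True, False, False): 7,    # e
    (False, True, False): 6,    # f
    (False, False, True): 15,   # g
}
TACOS, BURRITOS, SOUP = 0, 1, 2  # position of each question in the key

def probability_yes(table, question):
    """Share of the table's total weight held by rows that answer yes."""
    total = sum(table.values())
    return sum(w for row, w in table.items() if row[question]) / total

def condition(table, question, answer):
    """Keep only rows matching the answer, rescaled so they total 100."""
    kept = {row: w for row, w in table.items() if row[question] == answer}
    total = sum(kept.values())   # assumes at least one matching row
    return {row: 100 * w / total for row, w in kept.items()}

p_tacos = probability_yes(joint, TACOS)              # 76 / 100
liked_tacos = condition(joint, TACOS, True)          # rows a, b, c, e rescaled
p_burritos = probability_yes(liked_tacos, BURRITOS)  # (30 + 35) / 76

for row, share in liked_tacos.items():
    print(row, round(share, 1))
```

Calling `condition` again on `liked_tacos` with the burrito answer repeats the same step for the next question.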

The respondent answers no.

So of the respondents who liked tacos but not burritos (c and e, 4% + 7% = 11% of all respondents)...
c People who liked tacos, soup only: 36.4%
e People who liked tacos only: 63.6%

Thus, there is a higher likelihood that the respondent will not like soup.
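This last step is the definition of conditional probability applied directly to the original percentages. Written out, with T, B and S standing for liking tacos, burritos and soup (symbols introduced only for this note):

```latex
\begin{align}
P(S \mid T, \neg B) &= \frac{P(T, \neg B, S)}{P(T, \neg B)}
  = \frac{c}{c + e} = \frac{4\%}{4\% + 7\%} = \frac{4}{11} \approx 36.4\% \\
P(\neg S \mid T, \neg B) &= \frac{e}{c + e} = \frac{7\%}{4\% + 7\%} = \frac{7}{11} \approx 63.6\%
\end{align}
```

Dividing the original shares in one step avoids carrying rounding from the intermediate table into the final figures.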



Now I just have to think about how to apply this better. For some reason, maybe my business stats class, I think I need least squares and a regression analysis...
 
After consulting a statistics professor, I will be using ANOVA or regression analysis. I may not use PHP as it is a pain to host.
 
