I don't mean how good their choices were or anything, just a clarification of what they chose: putting their choice into words so it's clear what the scores mean. That would make the task easier and reduce the risk of misunderstanding.
Yes, I'll try that :)
By "reference standard" do...
All right, thanks guys! I think I'll go with the regression model then, and show all three videos at once each time with a 1-10 scale for each. That way I think the scoring might be more consistent... I think maybe it'd be an idea to do some kind of normalization to make the relative scores more...
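To make the normalization idea concrete, here is a minimal sketch of one possible approach: z-scoring each rater's 1-10 scores across the three videos they saw together, so that a generous rater and a harsh rater end up on a comparable scale. The function name `normalize_per_rater`, the rater and video labels, and the scores are all made up for illustration, and per-rater z-scoring is just one option, not necessarily the normalization meant above.

```python
from statistics import mean, pstdev


def normalize_per_rater(ratings):
    """Z-score each rater's scores across the videos they rated.

    ratings: dict mapping rater -> dict mapping video -> score (1-10).
    Returns the same structure with per-rater z-scores.
    """
    normalized = {}
    for rater, scores in ratings.items():
        values = list(scores.values())
        mu = mean(values)
        sigma = pstdev(values)  # population std over this rater's scores
        # A rater who gave every video the same score expresses no
        # relative preference, so map all of their scores to 0.
        normalized[rater] = {
            video: (score - mu) / sigma if sigma > 0 else 0.0
            for video, score in scores.items()
        }
    return normalized


# Hypothetical data: both raters rank the videos the same way,
# but user_b scores everything four points lower.
ratings = {
    "user_a": {"ground_truth": 9, "model_1": 7, "model_2": 5},
    "user_b": {"ground_truth": 5, "model_1": 3, "model_2": 1},
}
z = normalize_per_rater(ratings)
```

In this made-up example the two raters get identical normalized scores, since they differ only by a constant offset. Note that z-scoring also throws away how far apart a rater's raw scores were in absolute terms, which may or may not be what you want.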
Ah, I think the second answer implicitly does that erroneously here: https://stats.stackexchange.com/a/281020/218515. Also in several other studies, people seem to do it, like here http://www.zhizheng.org/papers/is2015_oliver_control.pdf. Tempting since it's much easier to get a lot of...
Dale: Ah yes, you could say I was basically thinking of doing a three-point Likert scale (prefer A, no preference, prefer B) at first, but maybe using more points would make the results more significant? And I guess I can then consider each observation independent, even though it's the same users voting...
I'm still designing it, and I figured I'd plan it out completely before conducting the survey.
Nope, but they're all in the same neutral mood and at the same speed really, and recorded by me.
Hi,
For my thesis, I've chosen to run a user preference study comparing the results of three different machine-learning models (or well, one is the ground truth) that output the mouth animation of a talking 3D model.
Users will see two videos at a time, each from one of the models, and indicate...