Comparing Non-Uniform Data Sets

AI Thread Summary
The discussion revolves around analyzing non-uniform data sets from student assignments reviewed by multiple teachers using criteria A, B, and C on a non-linear scale. The main challenge is how to normalize the data when some assignments receive more reviews than others, with suggestions including scaling up single reviews to match multiple ones or selecting a single review per assignment. The idea of using medians instead of averages is proposed to maintain meaningful whole number values. Additionally, visual representation methods like grid charts are suggested to summarize the data effectively, especially given the large number of students involved. The conversation highlights the need for careful data handling to ensure accurate representation of the reviews.
JPierce
I'm not sure my title is very descriptive, but I tried my best. I also hope I am posting this in the right forum. If not, please let me know. (I thought it might be better posted in the social sciences forum.)

I have a project where I am analyzing the results of multiple reviewers on a set of items. I am unsure as to the proper method of normalizing the data.

In essence, here is the problem:

We have a large stack of assignments turned in by students, with one assignment turned in by each student. Teachers analyzed each assignment according to three criteria, which I will call A, B, and C. Teachers assigned values for each of these criteria on a scale from 1 to 5. However, this scale is not linear, so A=4 is not twice as "big" as A=2.

I simply want to display a summary of the results using (say) a histogram. I have no interest in calculating summary statistics because the numerical values for each criterion are purely denumerable -- that is, A = 2.4 (which could correspond to say grade level) is meaningless.

So far, so good. But some of the assignments were reviewed by up to five teachers. Others were reviewed by only one.

So one assignment turned in by (say) Jimmy may have the following reviews from five individual teachers:

A = 3; B = 1; C = 4
A = 3; B = 2; C = 4
A = 3; B = 1; C = 2
A = 2; B = 3; C = 4
A = 3; B = 2; C = 4

Another assignment turned in by Mary may only have A = 3; B = 1; C = 4 as measured by a single teacher.

So, how do we handle the fact that some assignments have more reviews than others? We could just scale up the number of reviews to a common value. In other words, we could pretend that Mary turned in five identical assignments, that is,

A = 3; B = 1; C = 4
A = 3; B = 1; C = 4
A = 3; B = 1; C = 4
A = 3; B = 1; C = 4
A = 3; B = 1; C = 4

Somehow that doesn't seem quite right. And it would foul up the precision of the results.
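
One way to look at the "scale up" idea: for a histogram, copying Mary's single review five times gives the same proportions as letting every assignment carry a total weight of 1, split evenly over its reviews. Below is a minimal sketch of that weighting, using only the Jimmy and Mary reviews above. The function name `weighted_histogram` and the dictionary layout are illustrative assumptions, not something specified in the thread.

```python
from collections import defaultdict
from fractions import Fraction

# Reviews as (A, B, C) tuples, taken from the example above.
reviews = {
    "Jimmy": [(3, 1, 4), (3, 2, 4), (3, 1, 2), (2, 3, 4), (3, 2, 4)],
    "Mary": [(3, 1, 4)],
}

def weighted_histogram(reviews, criterion_index):
    """Histogram of one criterion where each assignment has total weight 1.

    Each review counts 1/n, with n the number of reviews that assignment
    received (hypothetical helper, for illustration only).
    """
    hist = defaultdict(Fraction)
    for student_reviews in reviews.values():
        weight = Fraction(1, len(student_reviews))
        for review in student_reviews:
            hist[review[criterion_index]] += weight
    return dict(hist)

hist_a = weighted_histogram(reviews, 0)  # criterion A
for score in sorted(hist_a):
    print(score, hist_a[score])
```

The weights sum to the number of assignments, so no student counts more than any other, but the sketch does not record that Mary's value rests on a single opinion.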

Another idea is to whittle down the number of reviewers to 1 for each assignment, but I have no good criteria for selecting the one sample to keep.

Any ideas?

If I have left out important info, just let me know.
 
The most reasonable approach would seem to be to give every student a single A, B, and C grade, each being the average of the grades that the different teachers gave for that criterion.

So that you would have:

Jimmy: A=2.8, B=1.8, C = 3.6
Mary: A=3.0, B=1.0, C=4.0
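
The averages above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the helper name `average_scores` is made up for illustration.

```python
# Reviews as (A, B, C) tuples, taken from the original post.
jimmy = [(3, 1, 4), (3, 2, 4), (3, 1, 2), (2, 3, 4), (3, 2, 4)]
mary = [(3, 1, 4)]

def average_scores(reviews):
    """Mean of each criterion across all reviews of one assignment."""
    # zip(*reviews) regroups the tuples into one column per criterion.
    return tuple(round(sum(col) / len(col), 1) for col in zip(*reviews))

print("Jimmy:", average_scores(jimmy))
print("Mary:", average_scores(mary))
```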
 
JPierce said:
I have no interest in calculating summary statistics because the numerical values for each criterion are purely denumerable -- that is, A = 2.4 (which could correspond to say grade level) is meaningless.

Oh, I am not really sure what you mean here, but it seems to indicate that only whole numbers are meaningful. In that case take medians instead of averages. So that you would have:

Jimmy: A=3, B=2, C=4
Mary: A=3, B=1, C=4
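
A minimal sketch of the median version follows. One caveat that is my own assumption, not something raised in the thread: with an even number of reviews the ordinary median can fall halfway between two scores (for example 2.5), so the sketch uses `statistics.median_low`, which always returns a value that was actually given. The helper name `median_scores` is hypothetical.

```python
from statistics import median_low

# Reviews as (A, B, C) tuples, taken from the original post.
jimmy = [(3, 1, 4), (3, 2, 4), (3, 1, 2), (2, 3, 4), (3, 2, 4)]
mary = [(3, 1, 4)]

def median_scores(reviews):
    """Low median of each criterion across all reviews of one assignment."""
    # median_low returns an actual data point, so the result stays a whole score.
    return tuple(median_low(col) for col in zip(*reviews))

print("Jimmy:", median_scores(jimmy))
print("Mary:", median_scores(mary))
```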
 
JPierce said:
...We have a large stack of assignments turned in by students, with one assignment turned in by each student. Teachers analyzed each assignment according to three criteria, which I will call A, B, and C. Teachers assigned values for each of these criteria on a scale from 1 to 5. However, this scale is not linear, so A=4 is not twice as "big" as A=2.

I simply want to display a summary of the results using (say) a histogram...But some of the assignments were reviewed by up to five teachers. Others were reviewed by only one.

It sounds somewhat similar to the Collaborative Filtering problem made famous by the Netflix Prize, but here, if the number of students is not too large, it should be possible to present all the data with a few charts.

For example, the scores for criterion A could be represented by an Nx5 grid: number the students 1 to N according to some overall rating (such as average total score) and set the colour/brightness of cell (i,j) according to how many reviewers gave student i score j. Then repeat for criteria B and C (so all the data is in 3 grid charts, or they could be combined into one with an RGB colour mix). I don't know if any packages specifically do this type of chart, but it should be possible in Excel using conditional formatting.
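
The counts behind such a grid could be built along these lines. This is a sketch under assumptions: students are ordered by their mean total score per review (one possible "overall rating"), the function name `count_grid` is made up, and the actual colouring is left to whatever charting tool is used.

```python
# Reviews as (A, B, C) tuples, taken from the original post.
reviews = {
    "Mary": [(3, 1, 4)],
    "Jimmy": [(3, 1, 4), (3, 2, 4), (3, 1, 2), (2, 3, 4), (3, 2, 4)],
}

def count_grid(reviews, criterion_index, n_scores=5):
    """Return (ordered student names, N x n_scores grid of review counts)."""
    def overall(name):
        # Mean total (A + B + C) per review, used only to order the rows.
        student_reviews = reviews[name]
        return sum(sum(r) for r in student_reviews) / len(student_reviews)

    order = sorted(reviews, key=overall, reverse=True)
    grid = []
    for name in order:
        row = [0] * n_scores
        for review in reviews[name]:
            # Scores run from 1 to n_scores, so score j goes in column j - 1.
            row[review[criterion_index] - 1] += 1
        grid.append(row)
    return order, grid

order, grid_a = count_grid(reviews, 0)  # criterion A
for name, row in zip(order, grid_a):
    print(name, row)
```

Each row of `grid_a` is one student and each column one score, so the cell values are exactly the reviewer counts that would drive the colour or brightness.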
 
Thanks for all suggestions.

The number of students involved is very large -- tens of thousands.

I will look into the Netflix problem.

Using the median might work.

If anyone has more suggestions, please offer them. I will continue reading responses.
 