What's the Difference Between Data Cooking and Data Rigging?

  • Thread starter: Gruxg
  • Tags: Data
AI Thread Summary
The discussion centers on the linguistic differences between "data cooking" and "data rigging," particularly in the context of scientific integrity. Both terms imply manipulation of data, but their connotations may vary. "Data cooking" suggests altering or hiding data post-analysis, while "data rigging" implies pre-arranging data to achieve a desired outcome. The conversation highlights the importance of using precise language when addressing flawed data analysis methods without directly accusing individuals of dishonesty. Terms like "cherry picking" are introduced as alternatives, indicating selective use of data to support a hypothesis. The participants agree that avoiding labels may be the best approach to communicate concerns about data analysis methods respectfully. Overall, the thread emphasizes the need for clarity and tact in discussing scientific data integrity.
Gruxg
I'm not sure where I should post this question, so I will try here.

It's a language question. I am not a native English speaker and I don't know the difference between 'data cooking' and 'data rigging'. Which one sounds more offensive, or would be considered a more serious fault for a scientist?

What would you call, in English, a method of data processing and analysis that is not totally objective and is influenced by the result we would like to obtain? I think the author doesn't want to lie but is using an incorrect trick to get misleadingly good results. I'd like to be clear but not too rude.

Thanks!
 
Those two terms might as well be synonymous. They're both euphemisms, so the specific intended meaning is ambiguous. The conclusion in both cases is simply that the data has been deliberately manipulated and cannot be trusted. Exactly which euphemism is used doesn't really seem to matter.

Re-reading your post, I am under the impression that you plan to write to the author and ... well ... accuse him of manipulating the data.

Any label you use will be interpreted as rude, especially if it carries the accusation that the data was deliberately manipulated. If you want to avoid being rude, avoid using labels at all; just tell him that you think his data analysis is flawed. This leaves him an "out", in that you are not directly accusing him of deliberately cooking his data.
 
Thanks for your reply, Dave.

Actually, what I want to do is call attention to what I consider an incorrect method often used by many people, without addressing a specific person. I don't want to accuse all these people of deliberately lying, but I think they are in some way lying, although not deliberately.
 
Gruxg said:
Thanks for your reply, Dave.

Actually, what I want to do is call attention to what I consider an incorrect method often used by many people, without addressing a specific person. I don't want to accuse all these people of deliberately lying, but I think they are in some way lying to themselves.
There is also the term "cherry picking". It means that the data itself might be correct, but that picking certain data points and twisting them to look like they mean something different, in order to support your hypothesis, is unethical.
 
Another appropriate term is data mining, but cherry picking, as suggested above, is clearer. I don't believe the terms you suggested in your original post communicate what you are trying to say.
 
Evo said:
There is also the term "cherry picking"
Ooh. Good one.

John Creighto said:
Another appropriate term is data mining...
This is not my understanding of data mining.

I thought data mining simply meant deep, number-crunchy processing of data in search of patterns.

As an example, one might look at data much more closely than was originally intended. In a company of 10,000 people, one might find some very interesting emergent data that was not apparent from the individual data points: say, that a disproportionate number of employees at a military technology vendor are making long-distance overseas calls to hostile countries.

There is nothing wrong with the data or the methods it is subjected to; i.e., in my understanding, data mining is not the term that the OP is looking for.
 
This is true; however, if you've ever followed the Climate Audit blog, Steve McIntyre uses the term to describe the use of principal component analysis to extract a hockey stick shape from the data used in MBH98. The point is that the technique amplifies low-power noise, and therefore, by selecting a region with an apparent upward trend, the application of the technique gives the desired result. The word "mining" is used because we are digging through the data to get the desired result rather than trying to find an unbiased vantage point.

Even more so, the proxies selected by the technique were highly correlated with CO2 and thus established the desired correlation between CO2 and temperature. I do not know whether this use of the word is limited to McIntyre's blog or has a wider usage, but the term cherry picking is certainly widely used.
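
To illustrate the general idea of digging through data until it gives the desired result, here is a minimal Python sketch. It is not a reproduction of McIntyre's argument, of principal component analysis, or of the MBH98 method. It only shows a simpler effect under stated assumptions: if you generate many series of pure random noise and keep the ones that best correlate with a trend you have in mind, the average of the kept series tends to show that trend even though no series contains a real signal. The function names, the number of series, their length, and the linear ramp used as the "desired" trend are all hypothetical choices made for this example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def screen_by_trend(n_series=200, length=50, keep=10, seed=0):
    """Generate pure-noise series and keep those that best match a ramp."""
    rng = random.Random(seed)
    target = list(range(length))  # the "desired" upward trend (an assumption)
    series = [[rng.gauss(0.0, 1.0) for _ in range(length)]
              for _ in range(n_series)]
    corrs = [pearson(s, target) for s in series]
    # Rank series from most to least correlated with the desired trend.
    ranked = sorted(range(n_series), key=lambda i: corrs[i], reverse=True)
    chosen = ranked[:keep]
    return target, series, corrs, chosen


def average(series_list):
    """Point-by-point mean of several equal-length series."""
    return [sum(vals) / len(vals) for vals in zip(*series_list)]


target, series, corrs, chosen = screen_by_trend()
r_all = pearson(average(series), target)
r_chosen = pearson(average([series[i] for i in chosen]), target)
print(f"correlation with trend, all series averaged:    {r_all:+.2f}")
print(f"correlation with trend, chosen series averaged: {r_chosen:+.2f}")
```

On a typical run the second number should be much larger than the first, although every input series is noise. Whether that counts as "mining", "cherry picking", or just a flawed method is the wording question this thread is about; the sketch only shows why selecting on the desired outcome is a problem.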
 
Thanks a lot for the comments.

I knew the term "data mining" with the meaning explained by Dave:
http://en.wikipedia.org/wiki/Data_mining

I didn't know the term "cherry picking", but after searching a bit on the web, I think it refers to using only the data that support a hypothesis and disregarding the rest, while in my case the problem is the analysis rather than the selection of the data.

Maybe I should take Dave's advice and avoid any label.
 
Well, both the data and the method of analysis are things that can be cherry-picked.
 
  • #10
John Creighto said:
The point being the word mining is used because we are digging through the data to get the desired result rather than trying to find a non-biased vantage point.
Analogously, I could go to the local library to dig through the data there to get my desired result. But that does not make "going to the library" a term with negative or dishonest connotations.

Whereas cooking data, rigging data and cherry-picking are all distinctly negative and dishonest.
 
  • #11
Cherry mining.
 
  • #12
lisab said:
Cherry mining.

:smile:

I see your problem. Your tree is upside down.

:biggrin:
 
  • #13
I have to observe that I understand a subtle difference between 'cooking' and 'rigging' something.

To me cooking implies falsifying or hiding data after the event as in an accountant 'cooking the books' to present a false financial picture.

On the other hand, rigging implies prearranging something so that the outcome will be skewed in some desired fashion, as in 'loading the dice'. I don't think anyone would describe this as cooking the dice, but I have heard the term 'rigging the dice'.

The OP may also be interested in the following distinction.

Tax evasion is a crime.

Tax avoidance is common sense.
 