
Data cooking/data rigging

  1. Jul 19, 2010 #1
    I'm not sure where I should post this question, so I will try here.

    It's a language question. I am not a native English speaker and I don't know the difference between 'data cooking' and 'data rigging'. Which one sounds more offensive, or would be considered the more serious fault for a scientist?

    What would you call, in English, a method of data processing and analysis that is not totally objective and is influenced by the result we would like to obtain? I think the author doesn't want to lie but is using an incorrect trick to get misleadingly good results. I'd like to be clear but not very rude.

    Last edited: Jul 19, 2010
  3. Jul 19, 2010 #2
    Gold Member

    Those two terms might as well be synonymous. They're both euphemisms, so the specific intended meaning is ambiguous. The conclusion in both cases is simply that the data has been deliberately manipulated and cannot be trusted. Exactly which euphemism is used doesn't really seem to matter.

    Re-reading your post, I am under the impression that you plan to write to the author and ... well ... accuse him of manipulating the data.

    Any label you use will lead to an interpretation of rudeness, especially if it contains the accusation that the data was deliberately manipulated. If you want to avoid being rude, then avoid using labels at all; just tell him that you think his data analysis is flawed. This leaves him an "out", in that you are not directly accusing him of deliberately cooking his data.
  4. Jul 19, 2010 #3
    Thanks for your reply, Dave.

    Actually, what I want to do is call attention to what I consider an incorrect method often used by many people, without addressing any specific person. I don't want to accuse all these people of deliberately lying, but I think they are in some way lying, although not deliberately.
    Last edited: Jul 19, 2010
  5. Jul 19, 2010 #4
    Staff: Mentor

    There is also the term "cherry picking", which means that the data itself might be correct, but picking certain data points and twisting them to look like they mean something different in order to support your hypothesis is unethical.
    Last edited: Jul 19, 2010
  6. Jul 19, 2010 #5
    Another appropriate term is data mining, but cherry picking, as suggested above, is clearer. I don't believe the terms you suggested in your original post communicate what you are trying to say.
  7. Jul 19, 2010 #6
    Gold Member

    Ooh. Good one.

    This is not my understanding of data mining.

    I thought data mining simply meant deep, number-crunchy processing of data in search of patterns.

    As an example, one might look at data much more closely than was originally intended. In a company of 10,000 people one might find some very interesting emergent data that was not apparent from the individual data points: say, a disproportionate number of employees at a military technology vendor are correlated with long-distance overseas calls to hostile countries.

    There is nothing wrong with the data or the methods it is subjected to. That is, in my understanding, data mining is not the term that the OP is looking for.
  8. Jul 19, 2010 #7
    This is true; however, if you've ever followed the Climate Audit blog, Steve McIntyre uses the term to describe the use of principal component analysis to extract a hockey stick shape from the data used in MBH98. His point is that the technique amplifies low-power noise, and therefore, by selecting a region with an apparent upward trend, the application of the technique gives the desired result. The word mining is used because we are digging through the data to get the desired result rather than trying to find an unbiased vantage point.

    Even more so, the proxies selected by the technique were highly correlated with CO2 and thus established the desired correlation between CO2 and temperature. I do not know if this use of the word is limited to McIntyre's blog or has a wider usage, but the term cherry picking is certainly widely used.
  9. Jul 19, 2010 #8
    Thanks a lot for the comments.

    I knew the term "data mining" with the meaning explained by Dave.

    I didn't know the term "cherry picking", but after searching a bit on the web, I think it refers to using only the data that support a hypothesis and disregarding the rest, while in my case the problem is the analysis rather than the selection of the data.

    Maybe I should take Dave's advice and avoid any label.
  10. Jul 19, 2010 #9
    Well, both the data and the method of analysis are items which can be cherry-picked.
  11. Jul 19, 2010 #10
    Gold Member

    Analogously, I could go to the local library to dig through the data there to get my desired result. But that does not make "going to the library" a term with negative or dishonest connotations.

    Whereas cooking data, rigging data and cherry-picking are all distinctly negative and dishonest.
  12. Jul 19, 2010 #11
    Staff Emeritus
    Science Advisor
    Gold Member

    Cherry mining.
  13. Jul 19, 2010 #12
    Gold Member


    I see your problem. Your tree is upside down.

  14. Aug 15, 2010 #13
    I have to observe that I understand a subtle difference between 'cooking' and 'rigging' something.

    To me, cooking implies falsifying or hiding data after the event, as in an accountant 'cooking the books' to present a false financial picture.

    On the other hand, rigging implies prearranging something so the outcome will be skewed in some desired fashion, as in 'loading the dice'. I don't think anyone would describe this as cooking the dice, but I have heard the term 'rigging the dice'.

    The OP may also be interested in the following distinction.

    Tax evasion is a crime.

    Tax avoidance is common sense.