Featured I Using big data to identify astronomical data bias

  1. Apr 18, 2017 #1

    Chronos

    Science Advisor
    Gold Member

    I've been following, albeit loosely, the use of big data to refine astronomical data. It has frequently been noted that astronomy is an excellent testing ground for big data approaches. I'm led to wonder what kind of results have been achieved to date, and how effective these methods are at detecting bias in data sets, such as foreground contamination. Can they be used to test the parameter space of assumptions applied to data sets?
     
  2. jcsd
  3. Apr 18, 2017 #2
  4. Apr 19, 2017 #3

    Chronos

    Science Advisor
    Gold Member

    The article evokes a sense of pareidolia regarding big data. Intelligent life forms are predisposed to make associations between the unknown and the familiar. Anticipating danger by drawing parallels between new, unfamiliar data and data from past experience serves a vital survival role. Unanticipated and/or intense sensory inputs [e.g., loud noises] are autonomically processed as potential threats. Inputs reminiscent of pleasant experiences [e.g., music] are similarly processed. You cannot divorce the observer from this kind of bias because it is a hardwired response. I prefer to think the data can speak for itself. Correlations only suggest the possibility that a data set varies due to something other than random noise. It could be a known factor, an unknown factor, or merely systematic error. That is what I expect big data analysis should be capable of discerning.
     
  5. Apr 19, 2017 #4
    I agree, but that's a pretty thorough meta-analysis. It's been a while since I read the article, but as I recall the primary issue is confirmation bias, not in the assumptions but reflected in the data. They don't have sufficient variability over a large sample size. The logical conclusion is that the data is being reported in a biased manner which supports the presumed hypothesis. There is a very important distinction between producing data and then choosing, post hoc, a theory which is familiar to you, and, as the author suggests, allowing bias (most likely without intent) to influence the data reported because of a lack of experimental control. Simple experimental controls, where the scientists are blind to the data they are analyzing and what it represents, protect the integrity and validity of research. Humans, as you eloquently stated, are biased by nature, which is why experiments should always be designed to control for it.
     
  6. Apr 20, 2017 #5

    Chronos

    Science Advisor
    Gold Member

    I concur, but I see constraining the effects of human bias on the output as a motive for using big data analysis tools. I am less concerned with raw data than with data which has been 'cleaned' to filter out systematics.
     