Cosmology raw data

  1. Nov 14, 2009 #1
    I was reading Roger Penrose's book and he mentions that there are huge amounts of raw data from experiments that haven't yet been fully analysed. Is there a way I can get my hands on this raw data, for cosmology experiments or any other experiments with large amounts of raw data? I know that very new data isn't available because researchers on that particular experiment are allowed "first run" at the data, but old data is fine by me.
     
  3. Nov 14, 2009 #2

    turbo

    Gold Member

    There is a ton of raw data available. There's stuff from SDSS and other surveys, and there's stuff that's been compiled (not necessarily analyzed) in the NASA/IPAC Extragalactic Database (NED), HyperLEDA, and on and on. If you have a yen for statistical analysis of huge blocks of data, you can knock yourself out. This can be brain-numbing work, the kind of stuff that you'd like to enslave some grad students to do, but there are lots of observations that are available to you. Warning: you may spend a good deal of time trying to put measurements in compatible formats. Redshifts can be expressed in lots of reference frames, luminosities can be expressed in different bands, etc. If you're interested in looking for small systemic effects, you'll have to get the measurements expressed in compatible terms in order to detect them.
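
    To give a feel for the reference-frame issue, here is a minimal sketch of converting a heliocentric redshift to the CMB rest frame. The function names are made up for illustration, and the dipole values (about 369.82 km/s toward galactic l = 264.021°, b = 48.253°) are assumed from the Planck 2018 results, so check them against whatever convention your catalogue uses before relying on this.

```python
import math

C_KM_S = 299792.458      # speed of light in km/s
# Assumed solar motion relative to the CMB (Planck 2018 dipole values).
V_SUN_CMB = 369.82       # km/s
APEX_L_DEG = 264.021     # galactic longitude of the apex, degrees
APEX_B_DEG = 48.253      # galactic latitude of the apex, degrees


def cos_angle_to_apex(l_deg, b_deg):
    """Cosine of the angle between a sky position (galactic l, b) and the dipole apex."""
    l, b = math.radians(l_deg), math.radians(b_deg)
    l0, b0 = math.radians(APEX_L_DEG), math.radians(APEX_B_DEG)
    return (math.sin(b) * math.sin(b0)
            + math.cos(b) * math.cos(b0) * math.cos(l - l0))


def z_helio_to_cmb(z_hel, l_deg, b_deg):
    """Convert a heliocentric redshift to the CMB frame (hypothetical helper).

    The Sun moves toward the apex, so objects in that direction look
    slightly blueshifted to us; removing that shift raises their redshift.
    Uses the non-relativistic Doppler shift for the Sun's motion.
    """
    z_sun = -V_SUN_CMB * cos_angle_to_apex(l_deg, b_deg) / C_KM_S
    return (1.0 + z_hel) / (1.0 + z_sun) - 1.0


if __name__ == "__main__":
    # Same heliocentric redshift, opposite sides of the sky.
    print(z_helio_to_cmb(0.01, APEX_L_DEG, APEX_B_DEG))
    print(z_helio_to_cmb(0.01, APEX_L_DEG - 180.0, -APEX_B_DEG))
```

    The correction is only about 0.001 in z, which is negligible for distant galaxies but matters a lot if you are hunting for small effects at low redshift.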
     
  4. Nov 14, 2009 #3
    Thanks turbo, I'm not yet at the stage where I'd know what the hell to do with such data but I aim to start "playing around" with it next summer when I have time to learn statistics/cosmology in more depth. I think it'd be pretty fun though, not mind numbing, surely once you've written the code you just let it run and see what it finds?
     
  5. Nov 15, 2009 #4

    Chalnoth

    Science Advisor

    The primary problem with dealing with raw data is systematic errors. This means that to actually make good use of raw data, you have to really understand not only the physics and the statistical analysis techniques used, but also the hardware used to collect the data.

    Usually, of course, the observational teams will do as much as they can do to ensure that the systematic effects are taken care of. But it's generally a good idea to read their papers in detail to really get an idea of what's going on.

    If you want to get your elbows into this, I'd suggest starting with SDSS: http://www.sdss.org/
     
  6. Nov 15, 2009 #5
    Perhaps much, if not just about all, raw data of the type that you want, is considered proprietary, and NOT available for private use. There may well be serious computer security concerns about its possible transfer also. LOL
     
  7. Nov 15, 2009 #6

    Chalnoth

    Science Advisor

    This is the case in some circles. Any data that is collected as part of a NASA mission must be released to the public, however. And many other astronomy groups are moving in the same direction.
     
  8. Nov 15, 2009 #7
    So then NASA, which must have received many thousands of terabytes of raw data in total, is obligated to 'hand it over' to me, and any others in the USA?
     
  9. Nov 15, 2009 #8

    Chalnoth

    Science Advisor

    It's in the public domain, and usually available online. See here for one area of NASA research:
    http://lambda.gsfc.nasa.gov/

    Bear in mind that not all of the data that is made public due to this agreement is housed in NASA servers, as they also fund lots of other observational teams, who are also then bound by the same agreement. And yes, there is a heck of a lot of data available.
     
  10. Nov 15, 2009 #9
    Yes.

    You can start at http://lambda.gsfc.nasa.gov/

    Data that NASA maintains is covered under the Freedom of Information Act. Also OMB Circular A-110 Subpart A(d)(1)(2) requires that non-profit organizations receiving federal grant money make their research data available under the FOIA, although they can charge for the costs of copying data.

    http://www.whitehouse.gov/omb/rewrite/circulars/a110/a110.html

    What typically happens with NASA space missions is that the research group that was the main group involved with the mission gets the privilege of publishing the first paper using the data, but once that paper is published the data is made available to everyone else.
     
    Last edited by a moderator: May 4, 2017
  11. Nov 16, 2009 #10

    Wallace

    Science Advisor

    As others have indicated, the process of comparing different data sets in a comparable way is extremely non-trivial. Every survey has different selection effects, error budgets, etc. That's before you get to the issue that for cosmology to say anything meaningful about physics, you need to have a robust prediction to compare to (i.e. given a physics, what would we see?), and for almost all useful observations this prediction itself is a not entirely solved problem (though for some data sets such as the CMB we are pretty much there for all but the most esoteric models). For instance, we know very well what the LCDM model predicts, for a wide range of parameter values, about the number and nature of dark matter structures to expect. We know this from detailed simulations. However, translating that into a genuinely comparable prediction to real observations, given the unknown details of galaxy formation and evolution, is far from solved. We've come a long way but we aren't there yet.
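
    For contrast, here is roughly what "given a physics, what would we see?" looks like in one of the easy cases: predicted distance moduli in a flat LCDM model, compared to observations with a chi-squared. This is only a sketch under stated assumptions: the function names are invented for illustration, radiation is neglected, the errors are treated as independent and Gaussian, and the "observations" in the usage lines are synthetic placeholders generated from the model itself, not real measurements.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Luminosity distance in Mpc for flat LCDM (radiation neglected).

    Integrates 1/E(z') from 0 to z with the trapezoid rule, where
    E(z) = sqrt(omega_m * (1 + z)^3 + 1 - omega_m).
    """
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))

    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        integral += inv_e(i * dz)
    integral *= dz
    return (1.0 + z) * (C_KM_S / h0) * integral


def distance_modulus(z, **cosmo):
    """Predicted distance modulus mu = 5 log10(d_L / Mpc) + 25."""
    return 5.0 * math.log10(luminosity_distance_mpc(z, **cosmo)) + 25.0


def chi_squared(zs, mu_obs, sigmas, **cosmo):
    """Chi-squared between observed and predicted distance moduli."""
    return sum(((mu - distance_modulus(z, **cosmo)) / s) ** 2
               for z, mu, s in zip(zs, mu_obs, sigmas))


if __name__ == "__main__":
    # Synthetic placeholder "data": generated from the omega_m = 0.3 model.
    zs = [0.1, 0.3, 0.5, 0.8]
    mu_fake = [distance_modulus(z, omega_m=0.3) for z in zs]
    sigmas = [0.1] * len(zs)
    print(chi_squared(zs, mu_fake, sigmas, omega_m=0.3))  # the generating model
    print(chi_squared(zs, mu_fake, sigmas, omega_m=1.0))  # a matter-only model
```

    The hard part described above is everything this leaves out: selection effects, correlated errors, and the astrophysics connecting the model to what a telescope actually records.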

    I don't want to dissuade you, cosmology is very interesting at any level of investigation, but don't take the quote "there are huge amounts of raw data from experiments that haven't yet been fully analysed" to mean "there is heaps of low hanging fruit out there just waiting to be picked off". All the cosmological data that has come in has been analysed already in all kinds of ways, even if there are even more things you could think of doing that haven't been done yet. Yes, there could be (and probably are) surprises lurking in data already taken, but if they still remain hidden it's not for lack of trying.
     