Astroinformatics: Learn More at AAS10

  • Thread starter: Simfish
  • Tags: Field

Discussion Overview

The discussion revolves around the emerging field of astroinformatics, exploring its relevance, origins, and the challenges associated with handling large datasets in astrophysics. Participants examine whether astroinformatics represents a significant new discipline or merely a rebranding of existing practices in data management and analysis within astrophysics.

Discussion Character

  • Debate/contested
  • Conceptual clarification
  • Technical explanation

Main Points Raised

  • Some participants suggest that astroinformatics is a new field, drawing parallels to bioinformatics, which emerged from the need to manage large datasets in biology.
  • Others argue that astroinformatics may simply be a buzzword for techniques that have been in use for years, particularly in applied mathematics and statistics.
  • One participant highlights the challenges faced by existing projects like SDSS in managing large datasets, questioning whether they have truly solved the data handling issues.
  • Another participant emphasizes the necessity of advanced computational skills to effectively manage and analyze terabytes of astrophysical data.
  • Concerns are raised about the accessibility of raw data and the lack of mechanisms for independent verification of data reduction processes in large projects.
  • Some participants propose that while existing problems may be addressed, the focus should shift to identifying new challenges and opportunities within the field.

Areas of Agreement / Disagreement

Participants express differing views on whether astroinformatics constitutes a new field or is simply a rebranding of established practices. There is no consensus on the effectiveness of current data handling methods in astrophysics, particularly regarding the SDSS project.

Contextual Notes

Participants note limitations in current data management practices, including issues with data accessibility, metadata availability, and the need for collaboration between astrophysicists and computer scientists. The discussion reflects ongoing uncertainties in the field.


Looks like a buzzword attached to what people have been doing for years.
 


Yeah, seems like applied math or statistics would do the job for you well enough.
 


I think it came out of "bioinformatics." What happened was that with the genome mapping project, the biologists got whomped with massive amounts of data that they couldn't deal with, and so they needed to create this new subfield that combined CS and biology.

Looks like the same thing is happening with astrophysics. Also I think it's becoming obvious to people that we need some new research since the way that people have been handling data for years just doesn't work. If you have terabytes of data, you need some non-trivial CS skills to deal with it.
 


Are you suggesting that the work done by SDSS in handling large amounts of data doesn't work? They have 50 TB, maybe 100, and have been at it for a decade. I would have said that this is largely a solved problem.
 


Vanadium 50 said:
Are you suggesting that the work done by SDSS in handling large amounts of data doesn't work?

Not sure, but I think they are likely running into the same issues that computational astrophysicists are running into, and I've seen nothing that suggests that they've made any progress in data handling that the computational astrophysicists haven't.

They have 50 TB, maybe 100, and have been at it for a decade.

Sure, and that will give you raw data. The trouble with raw data is that it's pretty much useless without tools to do data mining and visualization. Suppose you have 50 TB of data that is the result of a simulation, and you want to run statistics. You end up spending a few weeks writing a program that hits the raw data files, and this program takes two days to run, and after another two weeks of pulling your hair out, you finally get a graph.

Except then you want to run some other statistics, and you have to go through it all over again. And then you find that the raw data is on one server, the analysis program is on another, and you are not going to FTP 50 TB of data over.
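As a rough illustration of the "run statistics over data you can't load at once" step, here is a minimal sketch of a one-pass mean and standard deviation (Welford's update) that reads the data in chunks. The `read_chunks` helper and the small in-memory list are hypothetical stand-ins for real raw data files; nothing here reflects how SDSS or any simulation code actually stores data.

```python
import math
from typing import Iterable, Iterator, List, Tuple


def read_chunks(values: List[float], chunk_size: int) -> Iterator[List[float]]:
    """Hypothetical stand-in for reading a huge file piece by piece."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def streaming_mean_std(chunks: Iterable[Iterable[float]]) -> Tuple[int, float, float]:
    """Single pass with Welford's update; only three numbers are kept in memory."""
    count, mean, m2 = 0, 0.0, 0.0
    for chunk in chunks:
        for x in chunk:
            count += 1
            delta = x - mean
            mean += delta / count
            m2 += delta * (x - mean)
    # Population standard deviation (divide by count, not count - 1).
    std = math.sqrt(m2 / count) if count else 0.0
    return count, mean, std


# Toy data standing in for terabytes of simulation output.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
count, mean, std = streaming_mean_std(read_chunks(data, chunk_size=3))
print(count, mean, std)
```

The point of the design is that the statistic never needs the whole dataset in memory, so in principle the code can run next to the data instead of the data being copied to the code.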

And then you find that all of the data is scattered across three or four files, with no metadata, and in order to do calibrations, you have to spend a few weeks e-mailing people trying to get information about what the data means.

Again, it's possible that SDSS has totally licked the problem, but I really, really, really, really doubt it. What they seem to be doing is doing the best with what tools are available and processing the data so that it's generally available for other scientists. What they don't seem to have is a mechanism to allow general access to the original raw data, and then have outside groups totally reproduce their data reduction.

You might reply that giving people access to 50 TB of data is impossible! And my point is that making those things possible is exactly what astroinformatics is all about.
 


I am aware of the problem. HEP has it as well. But like I said, SDSS has had this problem for a decade, and they are a successful experiment. This seems to me to be a largely solved problem.
 


Vanadium 50 said:
I am aware of the problem. HEP has it as well. But like I said, SDSS has had this problem for a decade, and they are a successful experiment. This seems to me to be a largely solved problem.

When you've solved the old problems, the next step is to find new problems to solve.

For example, it would be really neat if you could put SDSS on a server and turn it into Google Sky on steroids, where you can write an n-body simulation and pull the initial conditions from the SDSS server. Or be able to do a database query like "give me the distribution of Mg++ line widths for all of the F2 class stellar objects within the Milky Way." That's not going to be possible without a lot of cooperation between astrophysicists and CS people.
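To make the database-query idea concrete, here is a small sketch using Python's built-in `sqlite3` module. The `stellar_lines` table, its column names, the 5 km/s bin width, and the sample rows are all invented for illustration; this is not the SDSS schema, just one way such a question might be phrased as SQL.

```python
import sqlite3

# In-memory database with a hypothetical table of measured line widths.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE stellar_lines (
        object_id      INTEGER,
        spectral_class TEXT,
        in_milky_way   INTEGER,  -- 1 = inside the Milky Way, 0 = outside
        species        TEXT,
        line_width_kms REAL
    )
    """
)

# Made-up sample rows, not real measurements.
rows = [
    (1, "F2", 1, "Mg++", 12.0),
    (2, "F2", 1, "Mg++", 18.0),
    (3, "F2", 1, "Mg++", 14.5),
    (4, "G2", 1, "Mg++", 9.0),   # wrong spectral class
    (5, "F2", 0, "Mg++", 30.0),  # outside the Milky Way
    (6, "F2", 1, "Fe+", 7.0),    # wrong species
]
conn.executemany("INSERT INTO stellar_lines VALUES (?, ?, ?, ?, ?)", rows)

# "Distribution" here means a histogram of line widths in 5 km/s bins.
query = """
    SELECT CAST(line_width_kms / 5 AS INTEGER) * 5 AS bin_start,
           COUNT(*) AS n
    FROM stellar_lines
    WHERE spectral_class = 'F2'
      AND in_milky_way = 1
      AND species = 'Mg++'
    GROUP BY bin_start
    ORDER BY bin_start
"""
histogram = conn.execute(query).fetchall()
print(histogram)
```

The query itself is the easy part; the hard part the post is pointing at is getting calibrated, well-described data into a form where such a table can exist at all.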
 


Yes, but now you have moved away from "a whole new field of study", which is what this is being billed as, to "looking for interesting new things to do in an existing field". Not that the latter is bad - but I think that my description of a new buzzword being applied to what people have been doing for years fits.
 
