The End of Theory: The Data Deluge Makes the Scientific Method Obsolete


Discussion Overview

The discussion revolves around the implications of a data-driven approach to science, particularly in light of an article suggesting that the abundance of data may render traditional scientific methods obsolete. Participants explore the role of data mining, machine learning, and the necessity of theoretical frameworks in understanding scientific phenomena.

Discussion Character

  • Debate/contested
  • Exploratory
  • Technical explanation

Main Points Raised

  • Some participants argue that the scientific method will not be replaced due to its predictive nature, emphasizing the importance of explanation over mere data collection.
  • Others contend that the article presents a valid argument regarding the potential of data-driven predictions, questioning the need for theoretical models if sufficient data exists.
  • There is a discussion about the role of data mining in science, with some suggesting it is becoming increasingly applicable, particularly in fields like medical imaging.
  • Some participants express concern about the 'black box' nature of machine learning, noting that while algorithms can predict outcomes, understanding their workings remains challenging.
  • One participant highlights the historical context of quantum mechanics, suggesting that similar approaches could be applied to modern scientific inquiries using machine learning.
  • Another participant raises the issue of data sufficiency, arguing that while algorithms can analyze large datasets, the lack of comprehensive data in certain areas may limit their effectiveness.

Areas of Agreement / Disagreement

Participants express a range of views, with no clear consensus on the validity of the article's claims. Some support the idea that data-driven approaches can complement traditional methods, while others firmly defend the necessity of theoretical frameworks in scientific inquiry.

Contextual Notes

Participants note limitations regarding the understanding of machine learning algorithms and the sufficiency of data for developing new scientific insights. The discussion reflects a variety of perspectives on the balance between data collection and theoretical understanding.

CINA
Saw this interesting article and wondered what PF thought of it since it's close to home. Personally I don't see the scientific method being replaced anytime soon.

http://www.wired.com/science/discoveries/magazine/16-07/pb_theory/#/

Edit: Because of its predictive nature that is.
 
That article is complete garbage. It doesn't actually say anything or have a point besides "SOOO MANY BYTES ZOMG!1"

Whoever wrote it doesn't actually understand science. It's not about having data, it's about being able to explain the data. That's the whole goal, to understand what is going on, not just to see what is going on. Everybody knew that Maxwell's equations gave the correct answer regardless of reference frame. The problem was nobody knew why until Einstein came along and gave an explanation. If they'd just kept on collecting data, we wouldn't have gotten anywhere.
 
No, I see some validity in the argument. (That doesn't mean I agree, it just means it's a valid argument.)

The idea is: why do we need to have models of weather if we have enough data to simply predict what will happen without needing to know why it happens? (Basically, we've created a Farmer's Almanac, writ large.)



Of course the downside is...

one can foresee a Twilight Zone or Trekkian clichéd future where citizens continue to use the tools that have worked for centuries but are at a loss if anything changes or if anything breaks down.
 
WarPhalange said:
That article is complete garbage. It doesn't actually say anything or have a point besides "SOOO MANY BYTES ZOMG!1"

Whoever wrote it doesn't actually understand science. It's not about having data, it's about being able to explain the data. That's the whole goal, to understand what is going on, not just to see what is going on. Everybody knew that Maxwell's equations gave the correct answer regardless of reference frame. The problem was nobody knew why until Einstein came along and gave an explanation. If they'd just kept on collecting data, we wouldn't have gotten anywhere.

I think it's more about interpreting the data in a new way, which the computer can't do. "Data in, data out" (computer)---or "Data in, correlation out" (theorist)---
 
I think this article is very much on target. Basically, all this article is talking about is Data Mining, and I would agree that it's becoming more and more applicable to science. As an example, in undergrad I worked a co-op at a place that did medical imaging research, and one of the things they were talking about was getting computers to diagnose disease through Machine Learning. The gist of it is, say you want to be able to give a computer an MRI image and have it tell you whether the person has cancer. The best way to go about that is not to try to define which properties of an image mark it as cancerous; instead, the best way is to just give a statistical learning algorithm a million MRIs and say these 500,000 don't have cancer and these 500,000 do, and let it figure out what the difference is. A lot of people have a problem with the 'black box' aspect of it, but I think it's going to become, more and more, a standard tool in an ever-growing list of applications.
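A minimal sketch of that labeled-examples idea, using a nearest-centroid classifier: the feature vectors, labels, and two-feature "scans" below are all invented for illustration, and real medical-imaging classifiers are far more sophisticated, but the workflow (train on labeled examples, then classify new input) is the same.

```python
# Nearest-centroid classifier: learn one "average" feature vector per
# label from labeled examples, then assign new inputs to the closest one.
from statistics import mean

def train(examples):
    """examples: list of (feature_vector, label). Returns per-label centroids."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*vecs)]
            for label, vecs in by_label.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest (squared distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))

# Hypothetical training set: two-feature "scans" with known outcomes.
training = [([0.1, 0.2], "healthy"), ([0.2, 0.1], "healthy"),
            ([0.9, 0.8], "cancer"),  ([0.8, 0.9], "cancer")]
model = train(training)
print(predict(model, [0.15, 0.15]))  # → healthy
print(predict(model, [0.85, 0.85]))  # → cancer
```

Nothing in the trained model says *why* a scan is labeled one way or the other, which is exactly the 'black box' complaint raised above.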
 
Data mining, right! Closely related to this post.

The deluge of studies, articles, and records would enhance the tendency to select what seems supportive and to ignore what seems contradictory.
 
maverick_starstrider said:
I think this article is very much on target. Basically, all this article is talking about is Data Mining, and I would agree that it's becoming more and more applicable to science. As an example, in undergrad I worked a co-op at a place that did medical imaging research, and one of the things they were talking about was getting computers to diagnose disease through Machine Learning. The gist of it is, say you want to be able to give a computer an MRI image and have it tell you whether the person has cancer. The best way to go about that is not to try to define which properties of an image mark it as cancerous; instead, the best way is to just give a statistical learning algorithm a million MRIs and say these 500,000 don't have cancer and these 500,000 do, and let it figure out what the difference is. A lot of people have a problem with the 'black box' aspect of it, but I think it's going to become, more and more, a standard tool in an ever-growing list of applications.

Okay, now how does that apply to finding new science?
 
rewebster said:
I think it's more about interpreting the data in a new way, which the computer can't do. "Data in, data out" (computer)---or "Data in, correlation out" (theorist)---

Not quite true; there are classes of computer algorithms (neural networks, genetic algorithms, etc.) that are very good at predicting the outcome of future experiments if you train them well (i.e., give them lots of examples first). The interesting thing is that they tend to work well even when "normal" mathematical methods are very difficult to use, which is why they are now being used in, e.g., the social sciences.
Moreover, it is often very hard to understand WHY an algorithm works even after it has been trained and you can look at what it is actually doing; it is therefore unlikely that a human would ever stumble upon the methods "developed" by these algorithms.
 
WarPhalange said:
Okay, now how does that apply to finding new science?

Well, don't forget, quantum mechanics' original development was guided by the 'correspondence principle' (basically, 'let's find a framework that reproduces these new bizarre results and reduces to classical mechanics at the macro level, and worry about the physical intuition and meaning later').

Plus, instead of, say, attempting to deduce from physical intuition which combination of ingredients could potentially create a room-temperature superconductor, you could basically apply machine learning.

There was an article in Scientific American about a year ago on using evolutionary algorithms (a type of machine learning) to develop circuits. Within a fraction of a second the algorithm was able to develop the ideal circuit for things like high-pass and low-pass filtering, and within a couple of hours it was even able to develop special-use circuit layouts that were only patented a couple of years ago.
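The bare skeleton of such an evolutionary algorithm fits in a few lines. The toy problem below (matching a target bit string) is invented for illustration; the circuit work used far richer genomes and fitness tests, but the loop (score the population, keep the fitter half, breed mutated copies) is the same shape.

```python
# Minimal evolutionary loop: mutate, select, repeat.
import random

random.seed(0)  # deterministic run for the example

TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # arbitrary toy "ideal design"

def fitness(genome):
    """Count positions where the genome matches the target."""
    return sum(g == t for g, t in zip(genome, TARGET))

def mutate(genome, rate=0.1):
    """Flip each bit with a small probability."""
    return [1 - g if random.random() < rate else g for g in genome]

def evolve(pop_size=20, generations=100):
    pop = [[random.randint(0, 1) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]                      # keep the fitter half
        pop = parents + [mutate(random.choice(parents))    # breed mutated copies
                         for _ in parents]
    return max(pop, key=fitness)

best = evolve()
print(fitness(best), "out of", len(TARGET))  # best score found
```

Note the algorithm never "knows" what the target means; it only climbs the fitness score, which is why evolved designs can work without anyone understanding them.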
 
maverick_starstrider said:
Well, don't forget, quantum mechanics' original development was guided by the 'correspondence principle' (basically, 'let's find a framework that reproduces these new bizarre results and reduces to classical mechanics at the macro level, and worry about the physical intuition and meaning later').

Plus, instead of, say, attempting to deduce from physical intuition which combination of ingredients could potentially create a room-temperature superconductor, you could basically apply machine learning.

Okay, but you'd have to give it a bunch of models and things we don't even understand yet. What I mean is, you'd have to give it test data. Wood = bad conductor. Iron = better. HgBa2Ca2Cu3O8 = awesome. But that's not enough data to make it deduce new materials.

Going back to the cancer analogy, you fed a million different data points to the computer. Here you have a handful.

There was an article in Scientific American about a year ago on using evolutionary algorithms (a type of machine learning) to develop circuits. Within a fraction of a second the algorithm was able to develop the ideal circuit for things like high-pass and low-pass filtering, and within a couple of hours it was even able to develop special-use circuit layouts that were only patented a couple of years ago.

Yeah, that's pretty awesome, but like I said, we don't fully understand why superconductors do what they do (at least the high-temperature ones). Circuit design is using what we know very well to create new things. Trial and error will get you there eventually, and you can test whether you are there or not because we have a very good understanding of circuits. We can't simply model a new material because we're not even sure why the ones we already have work the way they do.
 
