Right to Big Data: Corporate Control or Equal Access?

  • Thread starter: zoobyshoe
  • Tags: Big data, Data
AI Thread Summary
Access to "big data" held by companies like Facebook and Google is a contentious issue in scientific research, as it limits verification and transparency. Researchers argue that without access to this data, the integrity of scientific findings is compromised, potentially leading to bad science and fraud. The corporate control over data may favor a select group of scientists, hindering equal opportunities for others. Calls for mandatory data release are growing, emphasizing that peer review is ineffective if the underlying data cannot be scrutinized. The debate highlights the tension between corporate privacy and the scientific community's need for transparency.
zoobyshoe
The Right to "Big Data."

"Troves of Personal Data, Forbidden to Researchers"
PALO ALTO, Calif. — When scientists publish their research, they also make the underlying data available so the results can be verified by other scientists.

At least that is how the system is supposed to work. But lately social scientists have come up against an exception that is, true to its name, huge.

It is “big data,” the vast sets of information gathered by researchers at companies like Facebook, Google and Microsoft from patterns of cellphone calls, text messages and Internet clicks by millions of users around the world. Companies often refuse to make such information public, sometimes for competitive reasons and sometimes to protect customers’ privacy. But to many scientists, the practice is an invitation to bad science, secrecy and even potential fraud.

The issue is that not every scientist is allowed access to "big data".

He added that corporate control of data could give preferential access to an elite group of scientists at the largest corporations. “If this trend continues,” he wrote, “we’ll see a small group of scientists with access to private data repositories enjoy an unfair amount of attention in the community at the expense of equally talented researchers whose only flaw is the lack of right ‘connections’ to private data.”

Also, as it says in the first quote, there is no way to check on the papers based on exclusive-access "big data". They might well be fraudulent.

http://www.nytimes.com/2012/05/22/science/big-data-troves-stay-forbidden-to-social-scientists.html
 


I don't believe they have a "right" to see the data. Although, for the purposes of research endeavors, these companies should be obligated to release the data to those who'd like to see the results.

But even though this is a social science issue currently, it will definitely spill over to the hard sciences, where data should be released but companies would rather hold out on the public sector. This is definitely an issue in my opinion, but I also have an issue with someone saying they have a right to see it.
 


phoenix:\\ said:
I don't believe they have a "right" to see the data. Although, for the purposes of research endeavors, these companies should be obligated to release the data to those who'd like to see the results.

But even though this is a social science issue currently, it will definitely spill over to the hard sciences, where data should be released but companies would rather hold out on the public sector. This is definitely an issue in my opinion, but I also have an issue with someone saying they have a right to see it.
The "right" is not to see the data in the first place, but to be able to verify it when you're checking the validity of a paper written by someone who was given access:

The chairman of the conference panel — Bernardo A. Huberman, a physicist who directs the social computing group at HP Labs here — responded angrily. In the future, he said, the conference should not accept papers from authors who did not make their data public. He was greeted by applause from the audience.

In February, Dr. Huberman had published a letter in the journal Nature warning that privately held data was threatening the very basis of scientific research. “If another set of data does not validate results obtained with private data,” he asked, “how do we know if it is because they are not universal or the authors made a mistake?”
 


Once again confirming that rich people can afford more stuff than poor people.
 


It's all settled in the Freedom of Information Act. It has been subject to intense discussions in a branch of science that cannot be discussed here.
 


Andre said:
It's all settled in the Freedom of Information Act. It has been subject to intense discussions in a branch of science that cannot be discussed here.
Are you saying Google is now the government?
 


zoobyshoe said:
"Troves of Personal Data, Forbidden to Researchers"


The issue is that not every scientist is allowed access to "big data".



Also, as it says in the first quote, there is no way to check on the papers based on exclusive-access "big data". They might well be fraudulent.

http://www.nytimes.com/2012/05/22/science/big-data-troves-stay-forbidden-to-social-scientists.html

Looks like peer review will be relegated to "spell checking the paper". Whether it’s secret communications regarding global warming data or drug effectiveness studies, only the profession overseeing the publication can force the data issues, IMO. That which cannot be peer reviewed shouldn’t be published.
 


ThinkToday said:
That which cannot be peer reviewed shouldn’t be published.

I agree completely. This seems to be the big issue here. It sort of defeats the purpose of peer review if you can't, well, have peers review it.
 


Isn't this also a problem with "black box" software that is often used in simulations and calculations? I know my undergrad research advisor complains about that sometimes (like when he used such a "black box" in his 2000 paper that I'm now expanding on). Seems to be a related, but older, issue.
 


A company doesn't have to release data, but if they don't, any work they "publish" should definitely be criticised and viewed with scepticism. They are essentially just PR and ads, not science.

For example, Rolls Royce suppress their data all the time. They don't go about publishing results of it though...
 