How Can Scientists Effectively Gather Data for Publishing?

  • Thread starter: ice109
  • Tags: Data Publishing

Discussion Overview

The discussion revolves around the challenges of gathering data for research publication, particularly focusing on the practice of extracting data from graphical representations in academic papers. Participants explore the expectations of researchers regarding data collection methods, the feasibility of contacting authors for original data, and the tools available for digitizing data from graphs.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant expresses frustration over being asked to extract data from a graph, questioning the precision of such methods.
  • Another participant suggests using software tools to digitize data from scanned graphs, indicating that this is a common practice.
  • Some participants share experiences of successfully contacting authors for data, while others caution that authors may not always be willing or able to provide it.
  • Concerns are raised about the potential lack of organization in authors' data records, which could complicate data retrieval.
  • Participants discuss the historical context of data collection methods, noting that older techniques involved manual measurements from physical graphs.

Areas of Agreement / Disagreement

There is no consensus on the reliability of extracting data from graphs or the likelihood of obtaining original data from authors. Multiple competing views exist regarding the expectations and practices in data gathering for research.

Contextual Notes

Participants highlight limitations such as the potential for imprecise data extraction methods and the variability in authors' willingness to share data. There is also mention of the challenges posed by authors' data management practices.

Who May Find This Useful

This discussion may be useful for researchers, interns, and students involved in data collection and publication in scientific fields, particularly those facing similar challenges in data extraction and communication with authors.

ice109
I'm interning right now for research electrical engineer something something and he's got me gathering data for him. One thing he wanted is for me to get a bunch of breakdown voltages for HTS dielectrics. Fine, good, I've gotten some. Now he plops down this new list of polymers with nanocomposites and all I have is papers written on these things. So this one paper has nothing but a small, small plot of the Weibull distribution for these things. The paper isn't about the breakdown voltage specifically, it's about other stuff. This information is ancillary.

I told him that the paper doesn't have any real data and he tells me that he expects me to grab a ruler and a sharp pencil and mine the data out of the 2"x2" plot. He tells me that that is how it is done but I can't believe it. That's so imprecise it's ridiculous. So then I asked him if getting in contact with the guys who published the papers might help and he said that not likely because as soon as you publish you throw out the data? This can't be true.

So fellows the two 13$ questions are: are people seriously expected to mine data from silly graphs and will contacting the writers be of no avail?
 
ice109 said:
So fellows the two 13$ questions are: are people seriously expected to mine data from silly graphs and will contacting the writers be of no avail?

Nope. I have done that myself. However, it doesn't mean that you would get it. Most authors will share the data with you, provided you explain what it will be used for and that, if you publish anywhere, you acknowledge them for sharing their data.

So then I asked him if getting in contact with the guys who published the papers might help and he said that not likely because as soon as you publish you throw out the data? This can't be true.

Your supervisor must have been schooled at the Hendrik Schön School of Experimental Data Taking.

Zz.
 
robphy said:
Get a decent quality scan of the graph, then ...

http://digitizer.sourceforge.net/
or
http://www.frantz.fi/software/g3data.php
or one from
http://www.ccp14.ac.uk/solution/hardcopy2data.htm

I can't believe that would be as accurate as I need it to be.

ZapperZ said:
Your supervisor must have been schooled at the Hendrik Schön School of Experimental Data Taking.

Zz.
I don't know who that is but I'll ask him


... just kidding, Google makes me think he would be insulted.
 
This program is also useful if you can get a good scan of the plot.

http://datathief.org/

Yes, people actually do this. Some people just don't want to share their actual data, or published it long before you could put the data tables online.
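Once points have been read off a Weibull plot with one of these tools, the numbers usually wanted are the scale and shape parameters rather than the raw points. The following is only a minimal sketch, assuming a two-parameter Weibull CDF, F(V) = 1 - exp(-(V/alpha)^beta), which becomes a straight line when written as ln(-ln(1-F)) = beta*ln(V) - beta*ln(alpha). The function name `fit_weibull` and the sample numbers are hypothetical; the paper in question may use different axes or a different fitting method.

```python
import math

def fit_weibull(voltages, failure_probs):
    """Least-squares line through (ln V, ln(-ln(1-F))).

    Returns (alpha, beta): the scale and shape of a two-parameter Weibull.
    Assumes 0 < F < 1 and V > 0 for every digitized point.
    """
    xs = [math.log(v) for v in voltages]
    ys = [math.log(-math.log(1.0 - f)) for f in failure_probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                      # slope of the line is the shape
    intercept = mean_y - beta * mean_x    # intercept equals -beta*ln(alpha)
    alpha = math.exp(-intercept / beta)
    return alpha, beta

# Synthetic check with made-up values (not data from any paper):
# generate exact points from a known Weibull and recover its parameters.
true_alpha, true_beta = 30.0, 5.0
volts = [20.0, 25.0, 30.0, 35.0]
probs = [1.0 - math.exp(-((v / true_alpha) ** true_beta)) for v in volts]
alpha, beta = fit_weibull(volts, probs)
```

With real digitized points the fit will not be exact, and the scatter about the line gives a rough feel for how much the digitizing error matters.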
 
ice109 said:
So fellows the two 13$ questions are: are people seriously expected to mine data from silly graphs and will contacting the writers be of no avail?

Welcome to the world of real research!

You can ask for the data, but the authors may or may not give it to you. The authors may be under some sort of contractual or legal restrictions with the data, or they might be too busy, or they might be jerks. Also, there is a good chance that *they* don't have the data in any electronic format that will make sense to you. You could get a dozen files with the data scattered in five different places, and then you'll have to manually type it in anyway.

If they don't, then yes, you get a good quality scan of the data, and then type the numbers into the spreadsheet. Also, you may find that it is faster to just type in the data than to e-mail people. What often happens is that you e-mail someone, they don't respond, and you don't know what that means.

Also, people have to do this with original data. Until CCDs came around, people did astronomical measurements by getting a photographic plate and using rulers to measure them, and you still have to do this with old data. Now with CCDs and computer processing, people take a digital scan, click on the points that they are interested in, and this goes into a file.
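The "click on the points" step comes down to a calibration: two known tick marks on each axis fix the mapping from pixel position to data value. Below is a minimal sketch of that idea, assuming the scan is not rotated or skewed and each axis is either linear or log10. The helper name `make_axis_map` and all pixel and tick values are made up for illustration; this is not how any particular digitizer program is implemented.

```python
import math

def make_axis_map(pix_a, val_a, pix_b, val_b, log=False):
    """Return a function that converts a pixel coordinate to a data value.

    pix_a, pix_b: pixel positions of two reference ticks on one axis.
    val_a, val_b: the data values printed at those ticks.
    log: True if the axis is logarithmic (base 10).
    """
    if pix_a == pix_b:
        raise ValueError("reference ticks must sit at different pixels")
    if log:
        val_a, val_b = math.log10(val_a), math.log10(val_b)
    slope = (val_b - val_a) / (pix_b - pix_a)

    def to_data(pix):
        value = val_a + slope * (pix - pix_a)
        return 10 ** value if log else value

    return to_data

# Hypothetical calibration: x axis linear from 0 to 10,
# y axis logarithmic from 1 to 1000 (pixel rows grow downward).
x_map = make_axis_map(100, 0.0, 500, 10.0)
y_map = make_axis_map(450, 1.0, 50, 1000.0, log=True)

# Pixel positions "clicked" on the scan, converted to data coordinates.
clicked = [(300, 250), (500, 50)]
points = [(x_map(px), y_map(py)) for px, py in clicked]
```

Getting the axis type right matters more than the clicking itself: treating a log axis as linear would put every point in the wrong place.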
 
ice109 said:
He tells me that that is how it is done but I can't believe it. That's so imprecise it's ridiculous.

It's old-fashioned, but you can get pretty precise numbers. The size of the dot and the errors in measurement are usually quite small.
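As a rough check on that claim, the ruler error can be turned into data units. This is only a back-of-the-envelope sketch for a linear axis: the 0.5 mm reading resolution and the axis span of 100 are assumptions picked for illustration, and the function name `reading_uncertainty` is hypothetical. Only the 2-inch plot size comes from the thread.

```python
def reading_uncertainty(axis_length_mm, axis_span, ruler_resolution_mm=0.5):
    """Data-unit uncertainty of one ruler reading on a linear axis."""
    return axis_span * ruler_resolution_mm / axis_length_mm

axis_length_mm = 2 * 25.4   # the 2-inch-wide plot, in millimetres
span = 100.0                # assumed axis span, in the plot's own units
err = reading_uncertainty(axis_length_mm, span)
relative = err / span       # fraction of the full axis span
```

Under these assumptions a single reading is off by roughly 1% of the axis span, which is often smaller than the scatter in the measurements themselves. On a logarithmic axis the same ruler error becomes a constant relative error instead.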

So then I asked him if getting in contact with the guys who published the papers might help and he said that not likely because as soon as you publish you throw out the data? This can't be true.

Physics professors aren't the greatest record keepers. Assuming that they still have the data, it's in some file marked "random_data_23.txt" next to some file called "more_random_data.txt"
 
twofish-quant said:
Physics professors aren't the greatest record keepers. Assuming that they still have the data, it's in some file marked "random_data_23.txt" next to some file called "more_random_data.txt"

:smile: I always used file names that involved the date and run... but that would require finding it in my lab notebook, which was (admittedly) a mess.

To the OP, I did once successfully contact authors for their data. (I wanted to plot their data in addition to mine in a graph used only in my Master's thesis.) For the purpose you state (publishing in a book format) it seems best to contact the original researchers for permission to use their data (even if you need to scan it off with a data-grabber) and acknowledge them for the use of the data.
 
