How Can Scientists Effectively Gather Data for Publishing?

  • Thread starter: ice109
  • Tags: Data Publishing
SUMMARY

This discussion centers on the challenges faced by researchers in gathering data for publication, specifically regarding the extraction of breakdown voltages from graphs in academic papers. Participants confirm that while it is common practice to mine data from graphical representations, the accuracy of this method is often questioned. Tools such as Digitizer, G3Data, and DataThief are recommended for converting graphical data into usable formats. Additionally, contacting original authors for data is encouraged, although responses may vary based on the authors' circumstances.

PREREQUISITES
  • Understanding of breakdown voltage concepts in materials science.
  • Familiarity with data extraction techniques from graphical representations.
  • Knowledge of software tools for digitizing data from plots.
  • Basic communication skills for reaching out to researchers for data requests.
NEXT STEPS
  • Research how to effectively use Digitizer for data extraction from graphs.
  • Learn about G3Data and its application in converting graphical data.
  • Explore best practices for contacting authors for data sharing and permissions.
  • Investigate the historical context of data collection methods in scientific research.
USEFUL FOR

Researchers, graduate students, and anyone involved in data collection and publication in scientific fields, particularly in materials science and engineering.

ice109
I'm interning right now for research electrical engineer something something and he's got me gathering data for him. One thing he wanted is for me to get a bunch of breakdown voltages for HTS dielectrics. Fine good I've gotten some. Now he plops down this new list of polymers with nano composites and all I have is papers written on these things. So this one paper has nothing but a small small plot of the Weibull Distribution for these things. The paper isn't about the breakdown voltage specifically, it's about other stuff. This information is ancillary.

I told him that the paper doesn't have any real data and he tells me that he expects me to grab a ruler and a sharp pencil and mine the data out of the 2"x2" plot . He tells me that that is how it is done but I can't believe it. That's so imprecise it's ridiculous. So then I asked him if getting in contact with the guys who published the papers might help and he said that not likely because as soon as you publish you throw out the data? This can't be true.

So fellows the two 13$ questions are: are people seriously expected to mine data from silly graphs and will contacting the writers be of no avail?
 
ice109 said:
So fellows the two 13$ questions are: are people seriously expected to mine data from silly graphs and will contacting the writers be of no avail?

Nope. I have done that myself. However, it doesn't mean that you would get it. Most authors will share the data with you, provided you explain what it will be used for and, if you publish anywhere, you acknowledge them for sharing their data.

So then I asked him if getting in contact with the guys who published the papers might help and he said that not likely because as soon as you publish you throw out the data? This can't be true.

Your supervisor must have been schooled at the Hendrik Schon's School of Experimental Data Taking.

Zz.
 
robphy said:
Get a decent quality scan of the graph, then ...

http://digitizer.sourceforge.net/
or
http://www.frantz.fi/software/g3data.php
or one from
http://www.ccp14.ac.uk/solution/hardcopy2data.htm

i can't believe that would be as accurate as i need it to be.

ZapperZ said:
Your supervisor must have been schooled at the Hendrik Schon's School of Experimental Data Taking.

Zz.
i don't know who that is but i'll ask him


... just kidding, google makes me think he would be insulted.
 
This program is also useful if you can get a good scan of the plot.

http://datathief.org/

Yes, people actually do this. Some people just don't want to share their actual data, or published it long before you could put the data tables online.
 
ice109 said:
So fellows the two 13$ questions are: are people seriously expected to mine data from silly graphs and will contacting the writers be of no avail?

Welcome to the world of real research!

You can ask for the data, but the authors may or may not give it to you. The authors may be under some sort of contractual or legal restrictions with the data, or they might be too busy, or they might be jerks. Also, there is a good chance that *they* don't have the data in any electronic format that will make sense to you. You could get a dozen files with the data scattered in five different places, and then you'll have to manually type it in anyway.

If they don't, then yes, you get a good quality scan of the data, and then type the numbers into the spreadsheet. Also, you may find that it is faster to just type in the data than to e-mail people. What often happens is that you e-mail someone, they don't respond, and you don't know what that means.

Also, people have to do this with original data. Until CCDs came around, people did astronomical measurements by getting a photographic plate and measuring it with rulers, and you still have to do this with old data. Now with CCDs and computer processing, people take a digital scan, click on the points that they are interested in, and this goes into a file.
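
For anyone wondering what the click-on-the-points step amounts to, here is a rough Python sketch of the idea, not the actual code of any of the tools linked in this thread. It assumes you know two reference ticks on each axis (pixel position and data value), and that each axis is either linear or log10. The function name make_axis_map and all the pixel and axis numbers are made up for illustration.

```python
import math


def make_axis_map(pix_a, val_a, pix_b, val_b, log=False):
    """Build a pixel -> data mapping from two known ticks on one axis.

    pix_a, pix_b: pixel coordinates of two axis ticks (hypothetical values).
    val_a, val_b: the data values printed at those ticks.
    log: True if the axis is logarithmic (base 10).
    """
    if pix_a == pix_b:
        raise ValueError("reference ticks must be at different pixels")
    if log:
        # Interpolate in log space, then convert back at the end.
        val_a, val_b = math.log10(val_a), math.log10(val_b)
    slope = (val_b - val_a) / (pix_b - pix_a)

    def to_data(pix):
        value = val_a + slope * (pix - pix_a)
        return 10 ** value if log else value

    return to_data


# Hypothetical calibration: x axis is log (10 at pixel 100, 100 at pixel 500),
# y axis is linear (0 at pixel 450, 1 at pixel 50; image y grows downward).
x_map = make_axis_map(100, 10.0, 500, 100.0, log=True)
y_map = make_axis_map(450, 0.0, 50, 1.0)

# Pixel positions you "clicked" on the scanned plot (made-up numbers).
clicked = [(100, 450), (300, 250), (500, 50)]
points = [(x_map(px), y_map(py)) for px, py in clicked]

for x, y in points:
    print(f"{x:.3f}\t{y:.3f}")
```

The y mapping handles the flipped image coordinates for free, because the slope just comes out negative. A Weibull probability axis is neither linear nor log10, so it would need its own transform in place of the log branch.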
 
ice109 said:
He tells me that that is how it is done but I can't believe it. That's so imprecise it's ridiculous.

It's old-fashioned, but you can get some pretty precise numbers. The size of the dot and the errors in measurement are usually quite small.
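
As a back-of-the-envelope check, here is a small Python sketch of how large the read-off error might be. Only the 2" plot size comes from the original post. The 300 dpi scan, the 0 to 100 axis range, and the 2-pixel click error are assumptions picked for illustration, and the axis is assumed to be linear.

```python
def digitizing_error(plot_size_in, dpi, axis_min, axis_max, click_error_px=1.0):
    """Estimate the read-off uncertainty along one linear axis.

    plot_size_in: printed length of the axis in inches.
    dpi: scan resolution in dots per inch (assumed).
    axis_min, axis_max: data values at the two ends of the axis (assumed).
    click_error_px: how many pixels off a click might be (assumed).
    """
    pixels = plot_size_in * dpi
    units_per_px = (axis_max - axis_min) / pixels
    return click_error_px * units_per_px


# 2" axis scanned at 300 dpi, spanning a hypothetical 0 to 100 range.
err = digitizing_error(2.0, 300, 0.0, 100.0, click_error_px=2.0)
print(f"about +/- {err:.2f} axis units, "
      f"or {100 * err / (100.0 - 0.0):.2f}% of the axis range")
```

With these assumed numbers the click error comes to about a third of a percent of the axis range. Symbol size and print quality add to that, but they are not modeled here.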

So then I asked him if getting in contact with the guys who published the papers might help and he said that not likely because as soon as you publish you throw out the data? This can't be true.

Physics professors aren't the greatest record keepers. Assuming that they still have the data, it's in some file marked "random_data_23.txt" next to some file called "more_random_data.txt"
 
twofish-quant said:
Physics professors aren't the greatest record keepers. Assuming that they still have the data, it's in some file marked "random_data_23.txt" next to some file called "more_random_data.txt"

:smile: I always used file names that involved the date and run... but that would require finding it in my lab notebook, which was (admittedly) a mess.

To the OP, I did once successfully contact authors for their data. (I wanted to plot their data in addition to mine in a graph used only in my Master's thesis.) For the purpose you state (publishing in a book format) it seems best to contact the original researchers for permission to use their data (even if you need to scan it off with a data-grabber) and acknowledge them for the use of the data.
 
