
#Engauge digitizer publication software#

Measurement data in publications is often provided only within figures, while the original data is not available. There are some very useful tools for digitizing such data, such as the web application WebPlotDigitizer, the app Engauge Digitizer, or the digitizer within the software Origin, but to my knowledge they only support raster images. Since publications are usually available in digital form and the figures in them are often embedded as vector graphics, a more accurate digitization would be desirable. Are there tools that can directly digitize vector paths from figures (similar to the aforementioned methods)? This question goes beyond precision (see further remarks below) and also addresses an efficient and semi-automated workflow.

The achievable accuracy of course depends on the quality of the figure, or more specifically on (i) how the figure was originally produced, and (ii) how it was processed during the publication process. Since high-quality plotting tools are often used (e.g. guaranteeing proper resampling) and journals don't always mess up, figures of appropriate quality should be available now and then.

The problem goes beyond precision in reading out values (which could be resolved by rasterizing figures at high resolution and using the aforementioned tools). In complex figures, graphs could (i) cover each other up, (ii) overlap themselves due to scatter and line thickness, and (iii) have varying sampling rates. Hence, rasterizing can lead to misinterpretation of the data. Using a vector graphics editor for preparation before rasterizing (e.g. hiding individual graphs) would help, but it is time-consuming and solves only some of the problems.

First, it should of course be checked whether the original numeric data are available, as required by some journals (unfortunately not in many fields). Also, the corresponding author could be contacted, which often won't lead to success for many reasons such as unavailability (of data or author, after some time) or unwillingness.

Thanks to Martin and Massimo Ortolano, whose contributions inspired some of the remarks.

This is an amusing programming/hacking challenge, but my guess is that in 90% of cases the best solution lies in the realm of human affairs, and that’s simply to email the authors and ask for the data. When it works (and I expect it would most of the time), you’ll know with certainty that the data you have is exactly what the authors were working with rather than some approximation recovered by trying to reverse engineer an unknown sequence of human and algorithmic processes that converted the raw data into a figure. (Vector graphics may offer the illusion of lossless encoding, but that’s assuming no lossy steps were applied by the human or software at any point along the way, a dangerous assumption to make in practice.)

On the rare occasion when it doesn’t work because the authors refuse to share the data, you’ll still learn something useful about how trustworthy their data can be assumed to be (i.e., not at all).
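
For figures that really are embedded as vector graphics, reading the path coordinates directly instead of rasterizing could look roughly like the sketch below. It is only an illustration under several assumptions that are not from the discussion above: the figure has already been exported to SVG, the curve of interest is a single `<path>` using only absolute `M`/`L` commands with no `transform` attributes, and two tick positions per axis are known for calibration. The SVG fragment, the `id` "curve", the tick values, and the helper names `path_points`, `axis_map` and `digitize` are all made up for the example.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical SVG fragment standing in for a figure extracted from a paper.
SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">
  <path id="curve" d="M 20 80 L 60 60 L 100 40 L 180 0"/>
</svg>"""

NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"


def path_points(svg_text, path_id):
    """Return the (x, y) vertices of one path, in SVG user units.

    Only absolute M/L commands are handled; curves, relative commands
    and transforms are deliberately rejected or ignored in this sketch.
    """
    root = ET.fromstring(svg_text)
    for elem in root.iter(SVG_NS + "path"):
        if elem.get("id") != path_id:
            continue
        d = elem.get("d", "")
        # Collect command letters (skipping e/E, which only occur in exponents).
        commands = set(re.findall(r"[A-DF-Za-df-z]", d))
        if commands - {"M", "L"}:
            raise ValueError(f"unsupported path commands: {sorted(commands)}")
        values = [float(v) for v in re.findall(NUMBER, d)]
        if len(values) % 2:
            raise ValueError("odd number of coordinates in path data")
        return list(zip(values[0::2], values[1::2]))
    raise KeyError(f"no path with id {path_id!r}")


def axis_map(p0, v0, p1, v1):
    """Linear map from an SVG coordinate to a data value, fixed by two ticks."""
    return lambda p: v0 + (p - p0) * (v1 - v0) / (p1 - p0)


def digitize(svg_text, path_id, x_ticks, y_ticks):
    """Convert path vertices to data coordinates (linear axes assumed)."""
    to_x = axis_map(*x_ticks)
    to_y = axis_map(*y_ticks)
    return [(to_x(x), to_y(y)) for x, y in path_points(svg_text, path_id)]


# Assumed calibration: x = 20 is data 0 and x = 180 is data 8;
# y = 80 is data 0 and y = 0 is data 4 (the SVG y axis points downwards).
data = digitize(SVG, "curve", (20, 0.0, 180, 8.0), (80, 0.0, 0, 4.0))
for x, y in data:
    print(x, y)
```

Even when this works, it only recovers what was drawn: if the plotting tool resampled, clipped, or simplified the curve before export, the vertices are not the original measurements.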
