Two downloads of the same dataset differ in size?
Reporter: feedback bot
Summary: Two downloads of the same dataset differ in size?
Created: 2016-01-28 11:26:43.868
Updated: 2016-01-29 09:44:49.37
Resolved: 2016-01-28 12:25:19.844
Description: Hi GBIF,
I accidentally started two downloads of the same dataset within a few seconds of each other (doi:10.15468/dl.hq24pa and doi:10.15468/dl.f60ih2). Strangely, the file sizes of the two downloads are quite different.
No big deal for me, but I thought you'd want to know!
Created: 2016-01-28 12:25:19.882
Updated: 2016-01-28 12:25:19.882
The user downloaded the records while the dataset was being recrawled, so the two downloads captured different numbers of records. The dataset now has 192k occurrences: http://www.gbif.org/dataset/1aaec653-c71c-4695-9b6e-0e26214dd817
(Compare with UAT, which for the moment still has the old dataset with 75k occurrences: http://www.gbif-uat.org/dataset/1aaec653-c71c-4695-9b6e-0e26214dd817 )
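The production/UAT comparison could also be scripted. This is only a rough sketch, not how the portal does it: it assumes the public occurrence search API returns a `count` field when queried with `datasetKey` and `limit=0`, and that the API base URLs are `https://api.gbif.org/v1` and `https://api.gbif-uat.org/v1`. The helper names (`count_url`, `occurrence_count`, `counts_differ`) are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Dataset key taken from the dataset URLs in this ticket.
DATASET_KEY = "1aaec653-c71c-4695-9b6e-0e26214dd817"


def count_url(api_base, dataset_key):
    """Build an occurrence search URL that asks only for the total count."""
    query = urlencode({"datasetKey": dataset_key, "limit": 0})
    return f"{api_base}/occurrence/search?{query}"


def fetch_json(url):
    """Fetch a URL and decode the JSON body (performs a network call)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def occurrence_count(api_base, dataset_key, fetch=fetch_json):
    """Return the occurrence count one environment reports for a dataset."""
    return fetch(count_url(api_base, dataset_key))["count"]


def counts_differ(api_bases, dataset_key, fetch=fetch_json):
    """Collect the count per environment and report whether they disagree."""
    counts = {base: occurrence_count(base, dataset_key, fetch) for base in api_bases}
    return counts, len(set(counts.values())) > 1
```

Calling `counts_differ(["https://api.gbif.org/v1", "https://api.gbif-uat.org/v1"], DATASET_KEY)` would return the count for each environment plus a flag saying whether they disagree. The `fetch` parameter is there only so the logic can be exercised without network access.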
I don't think there's a way to avoid this. Delaying downloads until any crawling of the data they depend on has finished could mean extremely long delays, since many downloads retrieve records from lots of datasets.
Should we show information on the crawl status on the dataset page?