Issue 16207

Source DwC-A no longer accessed for certain records of dataset (but not all)

Reporter: peterdesmet
Type: Bug
Summary: Source DwC-A no longer accessed for certain records of dataset (but not all)
Priority: Unassessed
Resolution: Fixed
Status: Closed
Created: 2014-07-25 13:39:17.552
Updated: 2018-05-31 16:55:30.832
Resolved: 2018-05-31 16:55:30.805
        
        
Description: This record (http://www.gbif.org/occurrence/772118333) was last accessed in September 2013 and shows outdated information. Meanwhile, the dataset (a DwC-A) has been updated numerous times, and most of its records show a recent access date, e.g. http://www.gbif.org/occurrence/868489783

What could be the reason that this particular record (and probably some others) is not updated?

I also noticed that the number of records in the source (IPT) is 3,574,201 while the number on GBIF is 3,574,347 (146 higher), so there is some discrepancy here.

Issue also recorded at: https://github.com/LifeWatchINBO/florabank1-occurrences/issues/1
    


Author: kbraak@gbif.org
Comment: Relates to POR-2347
Created: 2014-07-29 11:55:16.816
Updated: 2014-07-29 11:55:16.816


Author: omeyn@gbif.org
Created: 2014-08-04 15:49:38.502
Updated: 2014-08-04 15:49:38.502
        
For this dataset in particular, here are the record counts broken out by crawl_id:

crawl_id	records
1	145
7	1
11	3574201

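This matches the overcount of 146 that GBIF shows (145 + 1 records outside crawl 11, whose 3,574,201 records equal the source count). These appear to be 146 deletions that still need to be carried out.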
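
The arithmetic behind this can be checked with a short sketch. It is only an illustration, not the actual GBIF tooling: the function name `split_by_latest_crawl` is hypothetical, and it assumes the highest crawl_id is the most recent crawl. The counts are the ones quoted in this issue.

```python
# Record counts per crawl_id, copied from the breakdown in this issue.
crawl_counts = {1: 145, 7: 1, 11: 3574201}

# Totals reported in the issue description.
source_count = 3574201  # records in the source (IPT)
gbif_count = 3574347    # records shown on GBIF


def split_by_latest_crawl(counts):
    """Return (latest_crawl_id, current_count, stale_count).

    Assumption: the highest crawl_id is the most recent crawl, so records
    stamped with any other crawl_id were not seen in that crawl.
    """
    latest = max(counts)
    current = counts[latest]
    stale = sum(n for crawl_id, n in counts.items() if crawl_id != latest)
    return latest, current, stale


latest, current, stale = split_by_latest_crawl(crawl_counts)

# Records from the latest crawl should equal the source count, and the
# remaining records should equal the GBIF overcount.
print(current == source_count)
print(gbif_count - source_count == stale)
```

If both checks print True, the records with an older crawl_id account for the entire discrepancy, which is consistent with them being pending deletions.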