Issue 16204
occ counts 7x higher than expected in Ohio State University Fish Division (OSUM)
16204
Reporter: mdoering
Assignee: jlegind
Type: Bug
Summary: occ counts 7x higher than expected in Ohio State University Fish Division (OSUM)
Priority: Critical
Resolution: Fixed
Status: Closed
Created: 2014-07-24 18:15:57.766
Updated: 2018-05-31 16:33:39.383
Resolved: 2018-05-31 16:33:39.304
Description: Originally reported by Alex Thompson on the API users list.
-----
Hi Alex,
at least the counts in our metrics cube and our Solr search index line up, and we do indeed have that large number of records in our index:
http://api.gbif.org/v1/occurrence/count?datasetKey=813b435e-f762-11e1-a439-00145eb45e9a
http://api.gbif.org/v1/occurrence/search?datasetKey=813b435e-f762-11e1-a439-00145eb45e9a
So it is not a bug in the API, but we need to figure out why we have that inflation.
Usually this is caused by identifier/triplet changes, but we need a bit more time to investigate before we can say anything more.
It might be related to the fact that the DwC archive of that dataset apparently does not even validate:
http://tools.gbif.org/dwca-reports/205-8707255612452827269.html
This is rather strange, as the dataset is served by an IPT, which reports your 98,439 records:
http://hymfiles.biosci.ohio-state.edu:8080/ipt/resource.do?r=osum-fish
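The comparison between the metrics cube count and the search index count described in the email above could be scripted roughly as follows. This is only a sketch: it assumes the count endpoint returns a bare number, that the search endpoint returns JSON with a top-level "count" field, and that limit=0 is accepted to skip the records themselves. The helper names (http_get, cube_count, search_count, counts_agree) are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "http://api.gbif.org/v1/occurrence"
DATASET_KEY = "813b435e-f762-11e1-a439-00145eb45e9a"  # dataset key from the URLs above


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def cube_count(dataset_key, fetch=http_get):
    """Record count from the metrics cube (assumed to be a bare number)."""
    url = f"{API}/count?{urlencode({'datasetKey': dataset_key})}"
    return int(fetch(url).strip())


def search_count(dataset_key, fetch=http_get):
    """Record count from the search index (assumed JSON with a 'count' field)."""
    url = f"{API}/search?{urlencode({'datasetKey': dataset_key, 'limit': 0})}"
    return int(json.loads(fetch(url))["count"])


def counts_agree(dataset_key, fetch=http_get):
    """True when the cube and the search index report the same count."""
    return cube_count(dataset_key, fetch) == search_count(dataset_key, fetch)
```

Calling counts_agree(DATASET_KEY) needs network access. The fetch parameter exists so the logic can be exercised offline with a stub.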
Author: jlegind@gbif.org
Created: 2014-09-11 16:10:28.631
Updated: 2014-09-11 16:10:28.631
#ct     collectioncode  institutioncode                                             date_crawled
1       Fish            OSUM                                                        2013-12-16
87733   Fish            OSUM                                                        2013-09-07
374     Fish            OSUM                                                        2013-12-17
55      Insects         OSUMT                                                       2013-12-17
451382  Insects         OSUMU                                                       2013-12-17
51      Fishes          Ohio State University - Fish Division, Columbus, OH (OSUM)  2014-03-15
98439   Fishes          Ohio State University - Fish Division, Columbus, OH (OSUM)  2014-08-26
These are the record counts by collection code and institution code for each crawl date. This suggests that the original dataset was split into two, and that this is the Fish collection that came out of it.
Deletion is moving forward and the publisher will be contacted.
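To put a number on the inflation, the counts listed above can be tallied as in the sketch below. It assumes that only the records from the most recent crawl (2014-08-26) are current and that everything from earlier crawls is stale. That is an interpretation of the table, not something the index itself states. The names ROWS, LATEST_CRAWL and summarize are illustrative.

```python
# (count, collection code, institution code, date crawled), copied from the counts above
OSUM_LONG = "Ohio State University - Fish Division, Columbus, OH (OSUM)"
ROWS = [
    (1, "Fish", "OSUM", "2013-12-16"),
    (87733, "Fish", "OSUM", "2013-09-07"),
    (374, "Fish", "OSUM", "2013-12-17"),
    (55, "Insects", "OSUMT", "2013-12-17"),
    (451382, "Insects", "OSUMU", "2013-12-17"),
    (51, "Fishes", OSUM_LONG, "2014-03-15"),
    (98439, "Fishes", OSUM_LONG, "2014-08-26"),
]

# Assumption: only records seen in this crawl are current.
LATEST_CRAWL = "2014-08-26"


def summarize(rows, latest=LATEST_CRAWL):
    """Return (total indexed, current, stale) record counts."""
    total = sum(count for count, _, _, _ in rows)
    current = sum(count for count, _, _, crawled in rows if crawled == latest)
    return total, current, total - current


total, current, stale = summarize(ROWS)
print(f"indexed={total} current={current} stale={stale} ratio={total / current:.2f}")
```

Under that assumption, 638035 records are indexed against 98439 current ones, a ratio of about 6.5, which is in the same range as the reported inflation. The 539596 remaining records would be the candidates for deletion.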