Issue 17909
Dataset page stat overview does not count all kingdom 'unknown' records
Reporter: jlegind
Assignee: bko
Type: Bug
Summary: Dataset page stat overview does not count all kingdom 'unknown' records
Priority: Major
Resolution: Duplicate
Status: Closed
Created: 2015-10-23 12:41:44.131
Updated: 2017-10-10 11:12:14.231
Resolved: 2017-10-10 11:12:14.198
Description: http://www.gbif.org/dataset/7bd65a7a-f762-11e1-a439-00145eb45e9a/stats
See the 'Basis of record by Kingdom' table.
The total count is not the sum of all the Kingdom terms. We are 100,000 records short because only 'incertae sedis' records are counted, and records with an empty kingdom value are not.
Author: mblissett
Created: 2015-12-30 12:25:56.397
Updated: 2015-12-30 12:25:56.397
There are at least two parts to this:
- Some names are in the incertae sedis kingdom, which is shown as unknown in the table. It would be better if this were labelled 'incertae sedis' or 'unplaced'. Occurrences in this row match to the backbone, but we don't know to what kingdom the name belongs.
- Some occurrences don't provide a scientific name, so the taxon key is null. This should be showing in the interpretation issues pie chart as TAXON_MATCH_NONE. These are more deserving of the 'unknown' row in the table.
- And I suppose there are cases where only a scientific name is provided, but it doesn't match to the backbone.
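To make the cases above concrete, here is a minimal sketch of how records could be assigned to table rows so that the rows always sum to the dataset total. It is not the portal's actual code: the field names (`scientificName`, `taxonKey`, `kingdom`), the helper names and the sample records are all assumptions for illustration, and it assumes that an occurrence with no backbone match has a null taxon key.

```python
from collections import Counter

INCERTAE_SEDIS = "incertae sedis"  # matched to the backbone, kingdom not known
UNKNOWN = "unknown"                # no backbone match at all


def stats_row(record):
    """Return the table row a record should be counted under (hypothetical helper)."""
    if record.get("taxonKey") is None:
        # Assumption: a null taxon key covers both a missing scientific name
        # and a supplied name that does not match the backbone.
        return UNKNOWN
    # Matched records with no kingdom value fall back to 'incertae sedis'.
    return record.get("kingdom") or INCERTAE_SEDIS


def count_rows(records):
    """Count records per table row; every record lands in exactly one row."""
    return Counter(stats_row(r) for r in records)


# Made-up sample records; the taxon keys are placeholders, not real backbone keys.
records = [
    {"scientificName": "Puma concolor", "taxonKey": 1, "kingdom": "Animalia"},
    {"scientificName": "Placed name", "taxonKey": 2, "kingdom": INCERTAE_SEDIS},
    {"scientificName": None, "taxonKey": None, "kingdom": None},
    {"scientificName": "Unmatched name", "taxonKey": None, "kingdom": None},
]

counts = count_rows(records)
# The per-row counts must add up to the dataset total.
assert sum(counts.values()) == len(records)
```

Keeping 'incertae sedis' and 'unknown' as separate rows is the point of the sketch: the first is a backbone placement, the second is a failed or impossible match.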
http://www.gbif.org/dataset/4e22dd77-6ba1-4256-a692-64fecdf38ea7/stats is an even more drastic example: almost 30% of the records in the dataset don't have a scientific name in the DwC-A, but that is not at all clear from the stats page. The affected records are listed here: http://www.gbif.org/occurrence/search?DATASET_KEY=4e22dd77-6ba1-4256-a692-64fecdf38ea7&ISSUE=TAXON_MATCH_NONE
It is much clearer with this Chrome extension: https://chrome.google.com/webstore/detail/gbif-dataset-metrics/kcianglkepodpjdiebgidhdghoaeefba (source: https://github.com/Datafable/gbif-dataset-metrics), which we should look at when redesigning this page.