Issue 14483

how to check for wrong Kingdom data?

14483
Reporter: feedback bot
Assignee: kbraak
Type: Improvement
Summary: how to check for wrong Kingdom data?
Resolution: Fixed
Status: Resolved
Created: 2013-12-18 11:07:39.136
Updated: 2014-04-30 15:04:37.167
Resolved: 2014-04-30 15:04:37.135
        
        
Description: On the stats page I see that some of our records are not shown as belonging to Kingdom Plantae. They are reported as "Unknown":

http://www.gbif.org/dataset/1c334170-7ed1-11df-8c4a-0800200c9a66/stats

Kingdom value is set to the fixed value "Plantae" by our tapirlink data provider (it is not mapped from a field in our database). So I can't figure out the problem.

How can I retrieve those records from the data portal? I can't see a way to search for "Kingdom = Unknown" or "Scientificname IS NOT Plantae".

Thanks
    
Attachment PF-1386.xlsx


Author: kbraak@gbif.org
Created: 2013-12-18 12:26:59.948
Updated: 2013-12-18 12:26:59.948
        
Attached is the result set (run on Hive, against prod.occurrence), showing the 332 records not interpreted as having kingdom equal to Plantae or Fungi. It confirms there are either problems with the source data, or with our own interpretation. [~mdoering@gbif.org] would you mind taking a quick look to see if you can spot any problems with the source data?




    


Author: mdoering@gbif.org
Created: 2013-12-18 12:29:53.003
Updated: 2013-12-18 12:29:53.003
        
Good catch, this is indeed not obvious. The interpreted higher classification is based on the GBIF backbone taxon the record is matched with.
It seems there are some species that we either cannot match at all or that have no kingdom assigned in our backbone.

This is not documented, but you can use the scientific name filter to search not only for species names but for any higher taxon. To list all occurrences in this dataset that are linked to the unknown "incertae sedis" kingdom, you can use the following search, which yields surprising results:
http://www.gbif.org/occurrence/search?DATASET_KEY=1c334170-7ed1-11df-8c4a-0800200c9a66&TAXON_KEY=0

There are very common species in this list, e.g. Achillea millefolium, Filago pyramidata, Senecio, ... :

http://www.gbif.org/occurrence/180208054
http://www.gbif.org/occurrence/180185404
http://www.gbif.org/occurrence/234772797

I can't immediately see why our backbone has no classification for those names; it is rather strange.
It is definitely a backbone bug, not a problem with our data!

Achillea millefolium:
http://www.gbif.org/species/3120060
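
For anyone who wants to repeat the search above for another dataset, here is a minimal sketch that assembles the same URL. It assumes only what the link above shows: the `DATASET_KEY` and `TAXON_KEY` query parameters, with `TAXON_KEY=0` standing for the unknown "incertae sedis" kingdom. The function and constant names are hypothetical and not part of any GBIF library.

```python
from urllib.parse import urlencode

# Base URL of the portal's occurrence search, copied from the link above.
OCCURRENCE_SEARCH = "http://www.gbif.org/occurrence/search"

# TAXON_KEY=0 is the key used above for the unknown "incertae sedis" kingdom.
UNKNOWN_KINGDOM_KEY = 0


def unknown_kingdom_search_url(dataset_key: str) -> str:
    """Hypothetical helper: build the search URL listing a dataset's
    occurrences that are linked to the unknown kingdom."""
    params = {"DATASET_KEY": dataset_key, "TAXON_KEY": UNKNOWN_KINGDOM_KEY}
    return f"{OCCURRENCE_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Dataset key taken from the stats page link in the description.
    print(unknown_kingdom_search_url("1c334170-7ed1-11df-8c4a-0800200c9a66"))
```

Swapping in a different dataset key should give the equivalent search for that dataset, as long as the portal keeps these parameter names.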

    


Author: mdoering@gbif.org
Comment: Created a new jira for the nub issue: http://dev.gbif.org/issues/browse/POR-1373
Created: 2013-12-18 12:33:13.503
Updated: 2013-12-18 12:33:13.503


Author: mdoering@gbif.org
Comment: [~kbraak@gbif.org] can we focus in this jira on how to enable users to search occurrences for higher taxa and the unknown kingdom better?
Created: 2013-12-18 12:34:17.481
Updated: 2013-12-18 12:34:28.984


Author: kbraak@gbif.org
Created: 2013-12-18 13:52:54.267
Updated: 2013-12-18 13:52:54.267
        
Looks like the scientific name filter needs some help text with example searches. How about we add the following help text to the box, either above the input, or in a popup:

You can use the scientific name filter to search not only for species names but for any higher taxon.
E.g. search for records having scientific name "Puma concolor (Linnaeus, 1771)"
E.g. search for records having kingdom equal to "Plantae"
E.g. search for records having uncertain placement (kingdom is uninterpreted) "incertae sedis"
E.g. search for records having order equal to "Primates Linnaeus, 1758"
    


Author: mdoering@gbif.org
Comment: We might need a special button to select all taxa in the unknown kingdom. Even when searching for incertae sedis, it's not found. Or we need to update the autocomplete to show that kingdom first.
Created: 2013-12-18 15:38:28.242
Updated: 2013-12-18 15:38:28.242


Author: sant
Created: 2013-12-20 18:31:29.066
Updated: 2013-12-20 18:31:29.066
        
Thanks Markus

The taxon_key=0 trick is quite useful and answers my question.


    


Author: omeyn@gbif.org
Comment: workaround appears sufficient to close this issue
Created: 2014-04-30 15:04:37.164
Updated: 2014-04-30 15:04:37.164