18207
Reporter: mblissett
Assignee: mdoering
Type: Bug
Summary: New NUB usages without authors (regression)
Priority: Blocker
Resolution: Fixed
Status: Closed
Created: 2016-02-04 12:53:25.753
Updated: 2016-03-21 16:28:07.074
Resolved: 2016-03-21 15:56:43.636
Description: The usage http://www.gbif-uat.org/species/7609276 is new, and its genus is http://www.gbif-uat.org/species/7804592 which doesn't have an author.
The old usage http://www.gbif-uat.org/species/2490384 has genus http://www.gbif-uat.org/species/2490383 , which does have an author. This one has been deleted.
(There are probably others, but this one has the most occurrences: 4M.)
Author: mblissett
Created: 2016-03-04 14:47:51.669
Updated: 2016-03-04 14:47:51.669
(I updated the links since this was found using a previous test NUB.)
Author: mdoering@gbif.org
Created: 2016-03-04 16:17:44.476
Updated: 2016-03-04 16:17:44.476
The id change is clearly wrong.
But the genus without authorship comes from CoL, which does not give any author for it:
http://www.gbif-uat.org/species/115784064
Here are all usages in the UAT CLB for Cardinalis:
{noformat}
id | dataset_key | substr | rank | scientific_name
-----------+--------------------------------------+---------------------------------------------------+-------+----------------------------
104076772 | 046bbc50-cae2-47ff-aa43-729fbf53f7c5 | International Plant Names Index | GENUS | Cardinalis Fabr.
104076774 | 046bbc50-cae2-47ff-aa43-729fbf53f7c5 | International Plant Names Index | GENUS | Cardinalis Rupp.
107867149 | 0938172b-2086-439c-a1dd-c21cb0109ed5 | Interim Register of Marine and Nonmarine Genera | GENUS | Cardinalis Bonaparte, 1831
107868809 | 0938172b-2086-439c-a1dd-c21cb0109ed5 | Interim Register of Marine and Nonmarine Genera | GENUS | Cardinalis Bonaparte, 1838
108476876 | 0938172b-2086-439c-a1dd-c21cb0109ed5 | Interim Register of Marine and Nonmarine Genera | GENUS | Cardinalis Fabricius, 1759
107880512 | 0938172b-2086-439c-a1dd-c21cb0109ed5 | Interim Register of Marine and Nonmarine Genera | GENUS | Cardinalis Jarocki, 1821
115190422 | 16c3f9cb-4b19-4553-ac8e-ebb90003aa02 | Wikipedia Species Pages - German | GENUS | Cardinalis
114110809 | 3e9a9493-47e4-4dc9-a73a-00c23156b100 | Colaboraciones Americanas Sobre Aves | GENUS | Cardinalis
101956984 | 714c64e3-2dc1-4bb7-91e4-54be5af4da12 | IRMNG Homonym List | GENUS | Cardinalis Bonaparte, 1831
101957229 | 714c64e3-2dc1-4bb7-91e4-54be5af4da12 | IRMNG Homonym List | GENUS | Cardinalis Bonaparte, 1838
102019108 | 714c64e3-2dc1-4bb7-91e4-54be5af4da12 | IRMNG Homonym List | GENUS | Cardinalis Fabricius, 1759
116897153 | 71667154-257d-4d8e-a2a5-711aaf9b2d74 | Phthiraptera.info | GENUS | Cardinalis
115784064 | 7ddf754f-d193-4cc9-b351-99906754a03b | Catalogue of Life | GENUS | Cardinalis
100074882 | 80b4b440-eaca-4860-aadf-d0dfdd3e856e | Official Lists and Indexes of Names in Zoology | GENUS | Cardinalis Bonaparte, 1838
100094644 | 80b4b440-eaca-4860-aadf-d0dfdd3e856e | Official Lists and Indexes of Names in Zoology | | Cardinalis Jarocki, 1821
116802460 | 88f4e35a-bdf8-4aa2-9a1b-56401d4eed15 | | GENUS | Cardinalis
102094242 | 9ca92552-f23a-41a8-a140-01abaa31c931 | Integrated Taxonomic Information System (ITIS) | GENUS | Cardinalis Bonaparte, 1838
113865628 | a6c6cead-b5ce-4a4e-8cf5-1542ba708dec | Artsnavnebasen | GENUS | Cardinalis
114998727 | bd0a2b6d-69d1-4650-8bb1-829c8f92035f | Biodiversity inventories in high gear: DNA barcod | GENUS | Cardinalis
100225628 | c696e5ee-9088-4d11-bdae-ab88daffab78 | IOC World Bird List, version 3.4 | GENUS | Cardinalis
115340076 | cbb6498e-8927-405a-916b-576d00a6289b | English Wikipedia - Species Pages | GENUS | Cardinalis
115337597 | cbb6498e-8927-405a-916b-576d00a6289b | English Wikipedia - Species Pages | GENUS | Cardinalis
113366329 | cbb6498e-8927-405a-916b-576d00a6289b | English Wikipedia - Species Pages | GENUS | Cardinalis Bonaparte, 1838
113590088 | cbb6498e-8927-405a-916b-576d00a6289b | English Wikipedia - Species Pages | | Cardinalis Fabr.
116914154 | d7435f14-dfc9-4aaa-bef3-5d1ed22d65bf | Taxonomy in Flux Checklist | GENUS | Cardinalis
7804592 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy | GENUS | Cardinalis
2490383 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy | GENUS | Cardinalis Bonaparte, 1831
3241527 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy | GENUS | Cardinalis Bonaparte, 1838
3232102 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy | GENUS | Cardinalis Fabricius, 1759
7650745 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy | GENUS | Cardinalis Jarocki, 1821
7904806 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy | GENUS | Cardinalis Rupp.
104120238 | fab88965-e69d-4491-a04d-e3198b626e52 | NCBI Taxonomy | GENUS | Cardinalis
{noformat}
Author: mdoering@gbif.org
Created: 2016-03-04 17:27:15.045
Updated: 2016-03-04 17:27:15.045
This is unexpected. When I create a test with all the names above and build a nub in the current dataset priority order, I end up with this result, which looks exactly as expected:
{noformat}
Animalia [kingdom]
Cardinalidae [family]
Cardinalis Bonaparte, 1838 [genus]
Cardinalis cardinalis (Linnaeus, 1758) [species]
Cardinalis Jarocki, 1821 [genus]
Plantae [kingdom]
Campanulaceae [family]
Cardinalis Fabr. [genus]
Cardinalis Rupp. [genus]
{noformat}
Cardinalis Jarocki and Rupp. will be flagged as doubtful, so only one accepted genus per kingdom remains. These should then be the ones occurrences get attached to.
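The rule described above can be sketched roughly as follows. This is a minimal illustration and not the NubBuilder code: the `Usage` record, the `flag_homonyms` function and the rule that the first usage in priority order stays accepted are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical record for illustration; not a ChecklistBank class.
Usage = namedtuple("Usage", ["kingdom", "name", "authorship"])


def flag_homonyms(usages):
    """Return (usage, status) pairs with one accepted usage per (kingdom, name).

    Assumption: `usages` is already sorted by source priority, so the first
    usage seen for a (kingdom, name) pair stays ACCEPTED and every later
    one becomes DOUBTFUL.
    """
    seen = set()
    result = []
    for u in usages:
        key = (u.kingdom, u.name)
        status = "DOUBTFUL" if key in seen else "ACCEPTED"
        seen.add(key)
        result.append((u, status))
    return result


# The four genera from the test tree above, in the order they appear there.
genera = [
    Usage("Animalia", "Cardinalis", "Bonaparte, 1838"),
    Usage("Animalia", "Cardinalis", "Jarocki, 1821"),
    Usage("Plantae", "Cardinalis", "Fabr."),
    Usage("Plantae", "Cardinalis", "Rupp."),
]
for usage, status in flag_homonyms(genera):
    print(usage.kingdom, usage.name, usage.authorship, status)
```

With this input, Bonaparte, 1838 and Fabr. stay accepted while Jarocki, 1821 and Rupp. are marked doubtful, matching the expectation stated above.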
Author: mdoering@gbif.org
Created: 2016-03-10 14:13:25.361
Updated: 2016-03-10 14:13:25.361
After rebuilding a new backbone we still see the issue.
These are all Cardinalis genera in the nub; the 2 old Bonaparte ones are now deleted:
{noformat}
2490383 | GENUS | 2016-03-10 04:55:00.207585 | Cardinalis Bonaparte, 1831
3232102 | GENUS | | Cardinalis Fabricius, 1759
3241527 | GENUS | 2016-03-10 04:55:27.932974 | Cardinalis Bonaparte, 1838
7370558 | GENUS | | Cardinalis Jarocki, 1821
8244778 | GENUS | | Cardinalis
7756296 | GENUS | | Cardinalis Rupp.
{noformat}
Author: mdoering@gbif.org
Created: 2016-03-10 20:09:49.704
Updated: 2016-03-10 20:09:49.704
There are 2 issues here.
1) The ids are not stable. This is dealt with in a new jira, POR-3060.
2) The genus Cardinalis is lacking an authorship even though the sources provide one. This remains the topic of this issue.
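A rough way to spot names affected by the second issue is to compare the backbone names with the names the sources provide, and report backbone genera that carry no authorship although at least one source has one. The sketch below is only an illustration: `split_genus` and `missing_authorship` are hypothetical helpers, not part of ChecklistBank, and splitting on the first space only works for uninomials such as genus names.

```python
def split_genus(scientific_name):
    """Split a genus name into (canonical, authorship).

    Assumption: the name is a uninomial, so everything after the first
    space is authorship. This would be wrong for species names.
    """
    canonical, _, authorship = scientific_name.strip().partition(" ")
    return canonical, authorship.strip()


def missing_authorship(nub_names, source_names):
    """Return canonical names that are bare in the nub but authored in a source."""
    authored = {}
    for name in source_names:
        canonical, authorship = split_genus(name)
        if authorship:
            authored.setdefault(canonical, set()).add(authorship)
    flagged = {}
    for name in nub_names:
        canonical, authorship = split_genus(name)
        if not authorship and canonical in authored:
            flagged[canonical] = sorted(authored[canonical])
    return flagged


# A few names taken from the listings in this issue.
nub = ["Cardinalis", "Cardinalis Jarocki, 1821", "Cardinalis Rupp."]
sources = ["Cardinalis", "Cardinalis Bonaparte, 1838", "Cardinalis Fabr."]
print(missing_authorship(nub, sources))
```

The same check could be run over any other homonym group; it only reports candidates and does not decide which authorship the backbone genus should get.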
Author: mdoering@gbif.org
Created: 2016-03-21 12:57:34.562
Updated: 2016-03-21 12:59:11.461
This also happens for other homonyms, e.g. the Oenanthe genera:
http://www.gbif-uat.org/species/search?q=oenanthe&dataset_key=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&rank=GENUS
That looks pretty bad, so I am increasing the priority to Blocker.
Weirdly, there are tests for exactly this and they do not fail:
https://github.com/gbif/checklistbank/blob/master/checklistbank-cli/src/test/java/org/gbif/checklistbank/nub/NubBuilderIT.java#L442