  1. Portal
  2. POR-3024

New NUB usages without authors (regression)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Checklistbank
    • Labels:

      Description

      The usage http://www.gbif-uat.org/species/7609276 is new, and its genus is http://www.gbif-uat.org/species/7804592 which doesn't have an author.

      The old usage http://www.gbif-uat.org/species/2490384 has genus http://www.gbif-uat.org/species/2490383, which does have an author. This one has been deleted.

      (There are probably others, but this one has the most — 4M — occurrences.)

          Activity

          Matthew Blissett added a comment -

          (I updated the links since this was found using a previous test NUB.)

          Markus Döring added a comment -

          The id change is clearly wrong.
          But the genus without authorship comes from CoL (Catalogue of Life), whose record does not have any author:
          http://www.gbif-uat.org/species/115784064

          Here are all usages in the UAT CLB for Cardinalis:

              id     |             dataset_key              |                      substr                       | rank  |      scientific_name       
          -----------+--------------------------------------+---------------------------------------------------+-------+----------------------------
           104076772 | 046bbc50-cae2-47ff-aa43-729fbf53f7c5 | International Plant Names Index                   | GENUS | Cardinalis Fabr.
           104076774 | 046bbc50-cae2-47ff-aa43-729fbf53f7c5 | International Plant Names Index                   | GENUS | Cardinalis Rupp.
           107867149 | 0938172b-2086-439c-a1dd-c21cb0109ed5 | Interim Register of Marine and Nonmarine Genera   | GENUS | Cardinalis Bonaparte, 1831
           107868809 | 0938172b-2086-439c-a1dd-c21cb0109ed5 | Interim Register of Marine and Nonmarine Genera   | GENUS | Cardinalis Bonaparte, 1838
           108476876 | 0938172b-2086-439c-a1dd-c21cb0109ed5 | Interim Register of Marine and Nonmarine Genera   | GENUS | Cardinalis Fabricius, 1759
           107880512 | 0938172b-2086-439c-a1dd-c21cb0109ed5 | Interim Register of Marine and Nonmarine Genera   | GENUS | Cardinalis Jarocki, 1821
           115190422 | 16c3f9cb-4b19-4553-ac8e-ebb90003aa02 | Wikipedia Species Pages - German                  | GENUS | Cardinalis
           114110809 | 3e9a9493-47e4-4dc9-a73a-00c23156b100 | Colaboraciones Americanas Sobre Aves              | GENUS | Cardinalis
           101956984 | 714c64e3-2dc1-4bb7-91e4-54be5af4da12 | IRMNG Homonym List                                | GENUS | Cardinalis Bonaparte, 1831
           101957229 | 714c64e3-2dc1-4bb7-91e4-54be5af4da12 | IRMNG Homonym List                                | GENUS | Cardinalis Bonaparte, 1838
           102019108 | 714c64e3-2dc1-4bb7-91e4-54be5af4da12 | IRMNG Homonym List                                | GENUS | Cardinalis Fabricius, 1759
           116897153 | 71667154-257d-4d8e-a2a5-711aaf9b2d74 | Phthiraptera.info                                 | GENUS | Cardinalis
           115784064 | 7ddf754f-d193-4cc9-b351-99906754a03b | Catalogue of Life                                 | GENUS | Cardinalis
           100074882 | 80b4b440-eaca-4860-aadf-d0dfdd3e856e | Official Lists and Indexes of Names in Zoology    | GENUS | Cardinalis Bonaparte, 1838
           100094644 | 80b4b440-eaca-4860-aadf-d0dfdd3e856e | Official Lists and Indexes of Names in Zoology    |       | Cardinalis Jarocki, 1821
           116802460 | 88f4e35a-bdf8-4aa2-9a1b-56401d4eed15 |                                                   | GENUS | Cardinalis
           102094242 | 9ca92552-f23a-41a8-a140-01abaa31c931 | Integrated Taxonomic Information System (ITIS)    | GENUS | Cardinalis Bonaparte, 1838
           113865628 | a6c6cead-b5ce-4a4e-8cf5-1542ba708dec | Artsnavnebasen                                    | GENUS | Cardinalis
           114998727 | bd0a2b6d-69d1-4650-8bb1-829c8f92035f | Biodiversity inventories in high gear: DNA barcod | GENUS | Cardinalis
           100225628 | c696e5ee-9088-4d11-bdae-ab88daffab78 | IOC World Bird List, version 3.4                  | GENUS | Cardinalis
           115340076 | cbb6498e-8927-405a-916b-576d00a6289b | English Wikipedia - Species Pages                 | GENUS | Cardinalis
           115337597 | cbb6498e-8927-405a-916b-576d00a6289b | English Wikipedia - Species Pages                 | GENUS | Cardinalis
           113366329 | cbb6498e-8927-405a-916b-576d00a6289b | English Wikipedia - Species Pages                 | GENUS | Cardinalis Bonaparte, 1838
           113590088 | cbb6498e-8927-405a-916b-576d00a6289b | English Wikipedia - Species Pages                 |       | Cardinalis Fabr.
           116914154 | d7435f14-dfc9-4aaa-bef3-5d1ed22d65bf | Taxonomy in Flux Checklist                        | GENUS | Cardinalis
             7804592 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy                            | GENUS | Cardinalis
             2490383 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy                            | GENUS | Cardinalis Bonaparte, 1831
             3241527 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy                            | GENUS | Cardinalis Bonaparte, 1838
             3232102 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy                            | GENUS | Cardinalis Fabricius, 1759
             7650745 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy                            | GENUS | Cardinalis Jarocki, 1821
             7904806 | d7dddbf4-2cf0-4f39-9b2a-bb099caae36c | GBIF Backbone Taxonomy                            | GENUS | Cardinalis Rupp.
           104120238 | fab88965-e69d-4491-a04d-e3198b626e52 | NCBI Taxonomy                                     | GENUS | Cardinalis
          
          Markus Döring added a comment -

          This is unexpected. When I create a test with all the names above and build a nub in the current dataset priority order, I end up with this result, which looks exactly as expected:

          Animalia [kingdom]
            Cardinalidae [family]
              Cardinalis Bonaparte, 1838 [genus]
                Cardinalis cardinalis (Linnaeus, 1758) [species]
            Cardinalis Jarocki, 1821 [genus]
          Plantae [kingdom]
            Campanulaceae [family]
              Cardinalis Fabr. [genus]
              Cardinalis Rupp. [genus]
          

          Cardinalis Jarocki, 1821 and Cardinalis Rupp. will be flagged as doubtful, so only one accepted genus per kingdom remains. These accepted genera should then be the ones that occurrences get attached to.
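
The expected outcome described above can be sketched roughly as follows. This is only an illustration and not the NubBuilder logic: it assumes the genera arrive already sorted by dataset priority, that the first genus seen in a kingdom stays accepted, and that later homonyms in the same kingdom are flagged doubtful. The function name `flag_homonyms` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def flag_homonyms(genera):
    """Return {kingdom: [(name, status), ...]} for (kingdom, name) pairs.

    Assumptions (not taken from the checklistbank code): all pairs share
    one canonical genus name, and `genera` is already sorted by dataset
    priority, so the first genus seen in a kingdom stays ACCEPTED and
    every later homonym in that kingdom becomes DOUBTFUL.
    """
    result = defaultdict(list)
    for kingdom, name in genera:
        status = "ACCEPTED" if not result[kingdom] else "DOUBTFUL"
        result[kingdom].append((name, status))
    return dict(result)


# Genera from the test result above, in the order they are listed there.
genera = [
    ("Animalia", "Cardinalis Bonaparte, 1838"),
    ("Animalia", "Cardinalis Jarocki, 1821"),
    ("Plantae", "Cardinalis Fabr."),
    ("Plantae", "Cardinalis Rupp."),
]
flagged = flag_homonyms(genera)
accepted = {
    kingdom: [name for name, status in usages if status == "ACCEPTED"]
    for kingdom, usages in flagged.items()
}
print(accepted)
```

With this input, exactly one accepted Cardinalis remains per kingdom, which is the state occurrences should be matched against.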

          Markus Döring added a comment -

          After rebuilding a new backbone we still see the issue.
          These are all Cardinalis genera in the nub; the 2 old Bonaparte ones are now deleted (see the timestamp column):

           2490383 | GENUS | 2016-03-10 04:55:00.207585 | Cardinalis Bonaparte, 1831
           3232102 | GENUS |                            | Cardinalis Fabricius, 1759
           3241527 | GENUS | 2016-03-10 04:55:27.932974 | Cardinalis Bonaparte, 1838
           7370558 | GENUS |                            | Cardinalis Jarocki, 1821
           8244778 | GENUS |                            | Cardinalis
           7756296 | GENUS |                            | Cardinalis Rupp.
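
To make the remaining state easier to read, the rows above can be filtered down to the genera that are still live. This is a small sketch that assumes the third column is the deletion timestamp (empty meaning not deleted) and that the columns are id, rank, deleted and scientific name; the helper name `live_usages` is hypothetical.

```python
RAW_ROWS = """\
 2490383 | GENUS | 2016-03-10 04:55:00.207585 | Cardinalis Bonaparte, 1831
 3232102 | GENUS |                            | Cardinalis Fabricius, 1759
 3241527 | GENUS | 2016-03-10 04:55:27.932974 | Cardinalis Bonaparte, 1838
 7370558 | GENUS |                            | Cardinalis Jarocki, 1821
 8244778 | GENUS |                            | Cardinalis
 7756296 | GENUS |                            | Cardinalis Rupp.
"""


def live_usages(text):
    """Return (id, scientific_name) for rows whose deleted column is empty."""
    live = []
    for line in text.strip().splitlines():
        # Assumed column order: id | rank | deleted timestamp | scientific name
        usage_id, _rank, deleted, name = (cell.strip() for cell in line.split("|"))
        if not deleted:
            live.append((int(usage_id), name))
    return live


live = live_usages(RAW_ROWS)
for usage_id, name in live:
    print(usage_id, name)
```

Under that assumption, no live Bonaparte genus is left, and an author-less Cardinalis (8244778) is among the live rows.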
          
          Markus Döring added a comment -

          There are 2 issues here.

          1) The ids are not stable. This is dealt with in a new Jira issue, POR-3060.

          2) The genus Cardinalis is lacking an authorship even though the sources provide it. This remains the topic of this issue.
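
The second issue can be checked mechanically. The following is a minimal sketch, not code from checklistbank: it takes (id, dataset_key, scientific_name) rows like those in the CLB listing earlier in this issue and reports backbone usages that consist of the bare canonical name although at least one source dataset supplies an authorship. The helper names and the simple "name differs from canonical name" test for authorship are assumptions made for illustration.

```python
# Dataset key of the GBIF Backbone Taxonomy, as shown in the CLB listing.
BACKBONE_KEY = "d7dddbf4-2cf0-4f39-9b2a-bb099caae36c"


def has_authorship(scientific_name, canonical_name):
    """Crude check: anything beyond the canonical name counts as authorship."""
    return scientific_name.strip() != canonical_name


def bare_backbone_usages(rows, canonical_name):
    """Return ids of backbone usages without authorship, but only when
    some non-backbone source provides an authorship for the same name."""
    source_has_author = any(
        has_authorship(name, canonical_name)
        for _usage_id, dataset_key, name in rows
        if dataset_key != BACKBONE_KEY
    )
    if not source_has_author:
        return []
    return [
        usage_id
        for usage_id, dataset_key, name in rows
        if dataset_key == BACKBONE_KEY and not has_authorship(name, canonical_name)
    ]


# A few rows copied from the Cardinalis listing in this issue.
rows = [
    (102094242, "9ca92552-f23a-41a8-a140-01abaa31c931", "Cardinalis Bonaparte, 1838"),
    (115784064, "7ddf754f-d193-4cc9-b351-99906754a03b", "Cardinalis"),
    (7804592, BACKBONE_KEY, "Cardinalis"),
    (3241527, BACKBONE_KEY, "Cardinalis Bonaparte, 1838"),
]
suspects = bare_backbone_usages(rows, "Cardinalis")
print(suspects)
```

A check like this only flags candidates; whether a bare backbone genus is actually wrong still depends on which source the nub build took it from.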

          Markus Döring added a comment - - edited

          This also happens for other homonyms, e.g. the Oenanthes:
          http://www.gbif-uat.org/species/search?q=oenanthe&dataset_key=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&rank=GENUS

          That looks pretty bad, so I am raising the priority to blocker.
          Weirdly, there are tests for exactly this and they do not fail:
          https://github.com/gbif/checklistbank/blob/master/checklistbank-cli/src/test/java/org/gbif/checklistbank/nub/NubBuilderIT.java#L442

          Markus Döring added a comment -

          neo4j and the usageDAO contain the right data with authors:

          neo4j-sh (?)$ match (n:TAXON) where n.canonicalName='Oenanthe' return n;
          +------------------------------------------------------------------------------------------+
          | Node[1575439]{scientificName:"Oenanthe Vieillot, 1816",rank:19,canonicalName:"Oenanthe"} |
          | Node[2104649]{rank:19,canonicalName:"Oenanthe",scientificName:"Oenanthe L."}             |
          | Node[4244677]{scientificName:"Oenanthe Pallas, 1771",canonicalName:"Oenanthe",rank:19}   |
          | Node[4426626]{rank:19,canonicalName:"Oenanthe",scientificName:"Oenanthe"}                |
          | Node[4527873]{rank:19,canonicalName:"Oenanthe",scientificName:"Oenanthe"}                |
          +------------------------------------------------------------------------------------------+
          
          NUB: NubUsage{usageKey=8239683, publishedIn=null, scientificNameID=null, rank=GENUS, origin=SOURCE, parsedName=Oenanthe G:Oenanthe R:gen. A:Vieillot Y:1816 [SCIENTIFIC], status=ACCEPTED, nomStatus=[], node=Node[1575439], kingdom=ANIMALIA, sourceIds=[115698472, 106343802, 102100548, 107871625, 101499267], issues=[], remarks=[], datasetKey=7ddf754f-d193-4cc9-b351-99906754a03b}
          
          USAGE: NameUsage{key=8239683, kingdom=null, phylum=null, clazz=null, order=null, family=null, genus=null, subgenus=null, species=null, kingdomKey=null, phylumKey=null, classKey=null, orderKey=null, familyKey=null, genusKey=null, subgenusKey=null, speciesKey=null, datasetKey=null, subDatasetKey=7ddf754f-d193-4cc9-b351-99906754a03b, nubKey=null, parentKey=1575041, parent=Muscicapidae, proParteKey=null, acceptedKey=null, accepted=null, basionymKey=null, basionym=null, scientificName=Oenanthe Vieillot, 1816, canonicalName=Oenanthe, vernacularName=null, authorship=null, nameType=null, taxonomicStatus=ACCEPTED, nomenclaturalStatus=[], rank=GENUS, publishedIn=null, accordingTo=null, numDescendants=0, isSynonym=false, origin=SOURCE, remarks=, references=null, taxonID=gbif:8239683, modified=null, deleted=null, lastCrawled=null, lastInterpreted=null, issues=[]}
          
          Markus Döring added a comment -

          https://github.com/gbif/checklistbank/commit/3e3d8cda49d962c3150072c669e5ba3291ddc8b2

          Markus Döring added a comment -

          Added a clb-admin CLI method to update all wrongly parsed names:
          https://github.com/gbif/checklistbank/commit/b6ce6f0eb33b52d959f737ce5130249563de9bb1


            People

            • Assignee:
              Markus Döring
              Reporter:
              Matthew Blissett
            • Votes:
              0
              Watchers:
              2

              Dates

              • Created:
                Updated:
                Resolved: