Portal / POR-3081

Backbone (infra)species lacking epithets

    Details

    • Type: Bug
    • Status: In Progress
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Checklistbank
    • Labels:

      Description

      There are 250 taxon names with "taxonRank=SPECIES" that are in fact genus names lacking a specific epithet (e.g. 7348906, 7350813, 8232585).

      There are also 140 taxon names with "taxonRank=VARIETY|FORM" whose infraspecific epithet is "null" (e.g. 7407832, 8181923, 8189733).

          Activity

          Markus Döring added a comment -

          example:
          http://www.gbif.org/species/7594570
          Senecio jacobaea null proles (Bertol.) Rouy, 1903

          The source usage has been removed, so it likely came from CoL.

          Markus Döring added a comment -

          This still persists in the August 2016 backbone. The missing infraspecific epithet holds the literal string value "null".
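A minimal sketch of how such usages could be flagged, assuming the scientific name is available as a plain string. The function name `has_literal_null` is hypothetical and not part of any GBIF library; it only looks for a standalone "null" token like the one in the example above.

```python
def has_literal_null(scientific_name: str) -> bool:
    """Return True if any whitespace-separated token is the literal string 'null'."""
    return any(token.lower() == "null" for token in scientific_name.split())


# The example name from this issue contains the literal token.
print(has_literal_null("Senecio jacobaea null proles (Bertol.) Rouy, 1903"))
```

A token-level comparison is used rather than a substring search so that epithets that merely contain the letters "null" are not flagged.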

          Markus Döring added a comment - edited

          "Chamaemy" of rank species is a current example:
          http://api.gbif.org/v1/species/7457557

          It has the same name as its IMPLICIT_NAME genus parent:
          http://api.gbif.org/v1/species/8035685

          The bad species comes from "Chamaemy Panzer, (1806-1809)" from the Official Lists and Indexes of Names in Zoology:
          http://api.gbif.org/v1/species/100082720

          Originally it says:

          elegans, Chamaemy[i]a, Panzer, (1806-1809), Fauna Ins. germ. (105): 12 (specific name of the
          type species of Chamaemyia Meigen, 1803) (Insecta, Diptera). Op. 847 ..
          

          That's one source we can fix, as it is managed by us and Rod: https://github.com/gbif/iczn-lists/

          Markus Döring added a comment -

          This is partly a problem with our name parser, which fails to handle a name like this:

          Chamaemy[i]a elegans
          

          See http://api.gbif.org/v1/species/100082720/verbatim

          Fixed by updating the ICZN lists dwca: https://github.com/gbif/iczn-lists/commit/12dd52b96ca6d3885df92b09359ebf4c51c0c812

          Markus Döring added a comment - edited
          select u.id, u.constituent_key, u.source_taxon_key, n.genus_or_above, n.scientific_name
          from name_usage u join name n on u.name_fk=n.id
          where u.deleted is null and u.dataset_key=nubKey() and u.rank='SPECIES'
            and n.specific_epithet is null and n.genus_or_above is not null;
          --> yields 436 usages

          select u.constituent_key, count(*)
          from name_usage u join name n on u.name_fk=n.id
          where u.deleted is null and u.dataset_key=nubKey() and u.rank='SPECIES'
            and n.specific_epithet is null and n.genus_or_above is not null
          group by u.constituent_key;
          --> yields 27 constituents.
          The main ones, together contributing 393 of the 436 usages (about 90%):
          
           7ddf754f-d193-4cc9-b351-99906754a03b:  169
           046bbc50-cae2-47ff-aa43-729fbf53f7c5:   93
           0938172b-2086-439c-a1dd-c21cb0109ed5:   67
           de8934f4-a136-481c-a87a-b0b202b80a31:   21
           2d59e5db-57ad-41ff-97d6-11f5fb264527:   19
           d9a4eedb-e985-4456-ad46-3df8472e00e8:   13
           9ca92552-f23a-41a8-a140-01abaa31c931:   11
          
          
          Markus Döring added a comment -

          The remaining cases look mostly like name parsing problems.

          The following doesn't parse properly and should be fixed in the name parser:

          Angiopteris d'urvilleana de Vriese

          These are badly formatted source names:

          Homozygosphaera Schilleri (Kamptner) Okada & McIntyre, 1977

          The parser fails to parse it with authorship and falls back to canonical-only parsing, ignoring anything but the first genus name. Suggestion: during the nub build, check whether the parsed name matches the rank, and either reject non-virus species with no specific epithet or use the unparsed, full scientific name instead of the bad canonical one.
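The suggested check can be sketched as follows. This is an illustration only, not the nub build code: `ParsedName` and `usable_name` are hypothetical names, and the `genus_or_above` and `specific_epithet` fields are borrowed from the column names in the SQL above. The `reject` flag switches between the two proposed behaviours.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedName:
    # Hypothetical, simplified stand-in for a parsed name record.
    scientific_name: str
    genus_or_above: Optional[str]
    specific_epithet: Optional[str]
    is_virus: bool = False


def usable_name(parsed: ParsedName, rank: str, reject: bool = True) -> Optional[str]:
    """Return the name to use for a usage, or None if the usage should be rejected."""
    parts = [p for p in (parsed.genus_or_above, parsed.specific_epithet) if p]
    canonical = " ".join(parts)
    # Only non-virus species without a specific epithet are treated as inconsistent.
    if rank != "SPECIES" or parsed.is_virus or parsed.specific_epithet:
        return canonical or parsed.scientific_name
    if reject:
        return None
    # Alternative behaviour: keep the unparsed, full scientific name.
    return parsed.scientific_name


bad = ParsedName(
    "Homozygosphaera Schilleri (Kamptner) Okada & McIntyre, 1977",
    genus_or_above="Homozygosphaera",
    specific_epithet=None,
)
print(usable_name(bad, "SPECIES"))                # rejected
print(usable_name(bad, "SPECIES", reject=False))  # falls back to the full name
```

Which of the two behaviours is preferable depends on whether a genus-only canonical name at species rank is considered worse than a missing usage; the sketch leaves that decision to the caller.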

          Acer √ó hillieri Lancaster
          Agave √ó franzosinii Hort.Hanb. ex W.Wats.
          

          Badly formatted IPNI names. Again, these are parsed only to the genus, with the authors-parsed flag set to false.
          The bad characters appear to represent the hybrid symbol, as in Acer × hillieri. The name parser could potentially be extended to understand this.
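One possible pre-processing step is sketched below. The garbled sequence is consistent with the UTF-8 bytes of the hybrid marker × having been read as Mac Roman, but that origin is an inference and is not confirmed by the source data. The function name `repair_hybrid_marker` is hypothetical.

```python
HYBRID_MARKER = "\u00d7"  # multiplication sign used as the hybrid marker

# Assumption: the garbled form is what the marker's UTF-8 bytes look like
# when they are decoded as Mac Roman.
GARBLED_MARKER = HYBRID_MARKER.encode("utf-8").decode("mac_roman")


def repair_hybrid_marker(name: str) -> str:
    """Replace the garbled two-character sequence with the real hybrid marker."""
    return name.replace(GARBLED_MARKER, HYBRID_MARKER)


for raw in ("Acer √ó hillieri Lancaster", "Agave √ó franzosinii Hort.Hanb. ex W.Wats."):
    print(repair_hybrid_marker(raw))
```

Replacing only this one known sequence is deliberately conservative: re-decoding whole names could damage author strings that are already correctly encoded.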

          Polana (Bulbusana) vana DeLong & Freytag 1972
          Tabanus 4punctatus Fabricius, 1805
          

          Name parser failures; these need to be fixed in the parser.
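To make the expected result for these names concrete, here is a rough sketch of a pattern that accepts a parenthesised subgenus, an epithet with leading digits, and an epithet containing an apostrophe. It is not the GBIF name parser's logic; `BINOMIAL` and `split_binomial` are hypothetical names, and authorship is ignored entirely.

```python
import re
from typing import Optional, Tuple

BINOMIAL = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)"                   # genus
    r"(?:\s+\((?P<subgenus>[A-Z][a-z]+)\))?"     # optional subgenus in parentheses
    r"\s+(?P<epithet>[0-9]*[a-z][a-z'-]*[a-z])"  # epithet: optional digits, apostrophe or hyphen allowed
    r"(?=\s|$)"                                  # epithet must end at whitespace or end of string
)


def split_binomial(name: str) -> Optional[Tuple[str, Optional[str], str]]:
    """Return (genus, subgenus, epithet), or None if no specific epithet is found."""
    match = BINOMIAL.match(name)
    if match is None:
        return None
    return match.group("genus"), match.group("subgenus"), match.group("epithet")


for name in (
    "Polana (Bulbusana) vana DeLong & Freytag 1972",
    "Tabanus 4punctatus Fabricius, 1805",
    "Angiopteris d'urvilleana de Vriese",
    "Chamaemy Panzer, (1806-1809)",
):
    print(name, "->", split_binomial(name))
```

The pattern requires a lowercase epithet, so a capitalised one such as in "Homozygosphaera Schilleri" is intentionally not matched: without further context it cannot be told apart from an author name.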

          Markus Döring added a comment -

          This is still true for 400 species in the January 2017 edition.


            People

            • Assignee: Markus Döring
            • Reporter: Markus Döring