Issue 18369

Over 168.000 species homonyms in the backbone for real?

18369
Reporter: mdoering
Assignee: mdoering
Type: Bug
Summary: Over 168.000 species homonyms in the backbone for real?
Description: There are over 168.000 species homonyms in the backbone, accounting for over 98 million occurrences, which seems a lot. Investigate whether that is correct. Key list attached.
Priority: Critical
Status: Open
Created: 2016-04-05 17:18:51.517
Updated: 2016-05-09 12:30:27.215
Attachment homonym-ids.txt
Attachment homonyms.txt.zip


Author: rdmpage
Created: 2016-04-07 10:26:26.08
Updated: 2016-04-07 10:26:26.08
        
Not sure that I follow this; the "homonyms" all seem to have different authors, e.g.

1	6910	UNRANKED	? apicalis	2	{7425198,8133285}	{"? apicalis Enderlein, 1920","? apicalis Szepligeti, 1906"}

Why would these be considered to be homonyms?
    


Author: mdoering@gbif.org
Comment: Maybe I used the term homonym badly, but I was looking into how many names exist more than once if you ignore the authorship. This is currently causing problems for the occurrence species matching (POR-2768), and I wanted to get an idea of how large the problem is. It is quite large, affecting 98 million occurrence records!
Created: 2016-04-07 10:44:48.625
Updated: 2016-04-07 10:44:48.625
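
For reference, counting names that "exist more than once if you ignore the authorship" can be sketched roughly as below. This is only an illustration, not the query that produced the attached lists: the function name `find_homonym_groups` and the `(key, canonical name, scientific name)` tuple layout are assumptions, and the sample rows are taken from listings in this issue.

```python
from collections import defaultdict


def find_homonym_groups(records):
    """Group name usages by canonical name (authorship ignored).

    `records` is an iterable of (key, canonical_name, scientific_name)
    tuples; this layout is an assumption made for the sketch.
    Returns only the canonical names that occur more than once.
    """
    groups = defaultdict(list)
    for key, canonical, scientific_name in records:
        groups[canonical].append((key, scientific_name))
    return {name: usages for name, usages in groups.items() if len(usages) > 1}


# Small sample of rows from this issue
records = [
    (2979108, "Acacia sentis", "Acacia sentis F.Muell."),
    (7536326, "Acacia sentis", "Acacia sentis F. Muell. ex Benth."),
    (8028112, "Acanthoderes curvistriata", "Acanthoderes curvistriata Zajciw, 1969"),
]
homonyms = find_homonym_groups(records)
for name, usages in homonyms.items():
    print(name, usages)
```

Only `Acacia sentis` is reported here, because the sample contains a single `Acanthoderes curvistriata` row.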


Author: mdoering@gbif.org
Created: 2016-04-11 16:17:21.4
Updated: 2016-04-11 16:17:44.977
        
Probably drastically reduced by https://github.com/gbif/checklistbank/commit/a14506422848bcba7b111b34689f7b1790f8035d

Before this change the family was required to match, which is often not the case.
    


Author: mdoering@gbif.org
Created: 2016-04-14 14:17:56.305
Updated: 2016-04-14 14:17:56.305
        
In the improved nub there are now 312.127 homonyms, 210.202 of them species. This count also reflects changed sources, e.g. 32k new names from new Plazi sources and an updated CoL. At first glance most of them look correct in the sense that they differ in authorship, and there are no longer any names without authors.
-----
Lots of basionym duplicates still exist because they are just lone epithets with different authors.
For example: "? acuminatus Motschulsky, 1844" and "? acuminatus Emerton, 1913"

These account for 2229 homonym canonical names
-----
The vast majority seem to be genuinely different names. Examples:
{noformat}
Abgrallaspis flabellata {8384876,2088164}       {"Abgrallaspis flabellata (Ferris, 1938)","Abgrallaspis flabellata Davidson, 1964"}
Abgrallaspis fraxini    {2088195,8163185}       {"Abgrallaspis fraxini Balachowsky, 1953","Abgrallaspis fraxini (McKenzie, 1944)"}
Abgrallaspis furcillae  {8271615,2088199}       {"Abgrallaspis furcillae (Brain, 1918)","Abgrallaspis furcillae Balachowsky, 1956"}
Abgrallaspis ithacae    {8071717,2088170}       {"Abgrallaspis ithacae Davidson, 1964","Abgrallaspis ithacae (Ferris, 1938)"}
Abgrallaspis latastei   {8115977,2088206}       {"Abgrallaspis latastei (Cockerell, 1894)","Abgrallaspis latastei Komosinska, 1969"}

 Acanthoderes curvistriata | 1152354 | SPECIES | Acanthoderes (Psapharochrus) curvistriata Tippmann, 1960
 Acanthoderes curvistriata | 8028112 | SPECIES | Acanthoderes curvistriata Zajciw, 1969
 Acanthoderes curvistriata | 8381622 | SPECIES | Acanthoderes (Psapharochrus) curvistriata Gilmour, 1965
 Acanthoderes curvistriata | 7522158 | SPECIES | Acanthoderes curvistriata Júlio, Giorgi & Monné, 2000
{noformat}

Many of these correct homonyms come from "ex" or "in" author citations. Maybe something is sometimes mixed up in the sources?
Examples:

{noformat}
 Acacia sentis   | 2979108 | SPECIES | Acacia sentis F.Muell.
 Acacia sentis   | 7536326 | SPECIES | Acacia sentis F. Muell. ex Benth.

 Acacia sirissa  | 8170030 | SPECIES | Acacia sirissa Buch.-Ham.
 Acacia sirissa  | 7629537 | SPECIES | Acacia sirissa Buch.-Ham. ex Wall.
 Acacia sirissa  | 3973437 | SPECIES | Acacia sirissa Buch.-Ham. ex Voigt
{noformat}

-----
There are only a few suspicious-looking names. This one is truly the same name:
{noformat}
Abacetus reflexicollis                       | 8026980 | SPECIES | Abacetus reflexicollis (Fairmaire, 1887)
Abacetus reflexicollis                       | 8433053 | SPECIES | Abacetus reflexicollis (Fairmaire, 1887)
{noformat}

 -----
Then there are a few where it looks like the author comparison didn't do a good job:
{noformat}
 Abutilon anderssonianum                      | 7879667 | SPECIES | Abutilon anderssonianum Garcke in Andersson
 Abutilon anderssonianum                      | 3940391 | SPECIES | Abutilon anderssonianum Garcke ex Andersson

 Abutilon halophilum                          | 3938486 | SPECIES | Abutilon halophilum F.Muell.
 Abutilon halophilum                          | 7930829 | SPECIES | Abutilon halophilum F. Müll.
{noformat}
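
A lenient author comparison that would catch the pairs above could look roughly like this. It is a sketch under stated assumptions, not the checklistbank implementation: the helper names (`normalize_author`, `author_parts`, `probably_same_authorship`) are hypothetical, umlauts are transliterated German-style (ü to ue), and an authorship is split on "ex" / "in" and considered a likely match if any part is shared. That last rule is deliberately loose and ignores the different nomenclatural meaning of "ex" and "in".

```python
import re

# German-style transliteration so that "Müll." and "Muell." compare equal
_TRANSLIT = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def normalize_author(author):
    """Lower-case, transliterate umlauts, drop brackets, dots and commas."""
    text = author.lower().translate(_TRANSLIT)
    text = re.sub(r"[().,]", " ", text)
    return " ".join(text.split())  # collapse whitespace


def author_parts(authorship):
    """Split on 'ex' / 'in' and return the set of normalized parts."""
    parts = re.split(r"\s+(?:ex|in)\s+", authorship)
    normalized = (normalize_author(part) for part in parts)
    return {part for part in normalized if part}


def probably_same_authorship(a, b):
    """Heuristic: two authorships match if they share any normalized part."""
    return bool(author_parts(a) & author_parts(b))


print(probably_same_authorship("F.Muell.", "F. Müll."))
print(probably_same_authorship("Garcke in Andersson", "Garcke ex Andersson"))
print(probably_same_authorship("(Ferris, 1938)", "Davidson, 1964"))
```

With this heuristic the two Abutilon pairs are treated as the same authorship, while `(Ferris, 1938)` and `Davidson, 1964` stay distinct. Note that it would also merge `F.Muell.` with `F. Muell. ex Benth.`, which may or may not be desired.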

-----
159 names are found at ranks above genus. With few exceptions, they all have at least one genus counterpart:
{noformat}
 KINGDOM | Archaea
 KINGDOM | Bacteria
 PHYLUM  | Acanthocephala
 PHYLUM  | Actinobacteria
 PHYLUM  | Aquificae
 PHYLUM  | Bryozoa
 PHYLUM  | Cephalorhyncha
 PHYLUM  | Chaetognatha
 PHYLUM  | Chlamydiae
 PHYLUM  | Chrysiogenetes
 PHYLUM  | Ciliophora
 PHYLUM  | Ctenophora
 PHYLUM  | Deferribacteres
 PHYLUM  | Elusimicrobia
 PHYLUM  | Foraminifera
 PHYLUM  | Gastrotricha
 PHYLUM  | Gemmatimonadetes
 PHYLUM  | Microsporidia
 PHYLUM  | Mollusca
 PHYLUM  | Nematoda
 PHYLUM  | Nitrospira
 PHYLUM  | Onychophora
 PHYLUM  | Pteridophyta
 PHYLUM  | Rotifera
 PHYLUM  | Thermodesulfobacteria
 PHYLUM  | Thermotogae
 CLASS   | Acantharia
 CLASS   | Actinobacteria
 CLASS   | Amphibia
 CLASS   | Appendicularia
 CLASS   | Aquificae
 CLASS   | Aves
 CLASS   | Chlamydiae
 CLASS   | Chrysiogenetes
 CLASS   | Deferribacteres
 CLASS   | Dothideales
 CLASS   | Elusimicrobia
 CLASS   | Euglenida
 CLASS   | Gemmatimonadetes
 CLASS   | Lingulata
 CLASS   | Myxomycetes
 CLASS   | Nitrospira
 CLASS   | Oligochaeta
 CLASS   | Ostracoda
 CLASS   | Polychaeta
 CLASS   | Polycystina
 CLASS   | Reptilia
 CLASS   | Saccharomycetes
 CLASS   | Thermodesulfobacteria
 CLASS   | Thermotogae
 ORDER   | Amphipoda
 ORDER   | Antipatharia
 ORDER   | Anura
 ORDER   | Blattodea
 ORDER   | Brachypoda
 ORDER   | Cingulata
 ORDER   | Coccoideaceae
 ORDER   | Coleoptera
 ORDER   | Diplura
 ORDER   | Diptera
 ORDER   | Dothideales
 ORDER   | Euglenida
 ORDER   | Gelidiales
 ORDER   | Hemiptera
 ORDER   | Heteropoda
 ORDER   | Hygrophila
 ORDER   | Hymenoptera
 ORDER   | Isopoda
 ORDER   | Isoptera
 ORDER   | Laurida
 ORDER   | Lepidoptera
 ORDER   | Lobata
 ORDER   | Mecoptera
 ORDER   | Medeolariales
 ORDER   | Megaloptera
 ORDER   | Mytiloida
 ORDER   | Neuroptera
 ORDER   | Octopoda
 ORDER   | Odonata
 ORDER   | Parachela
 ORDER   | Pholidota
 ORDER   | Phragmophora
 ORDER   | Pilosa
 ORDER   | Plecoptera
 ORDER   | Proboscidea
 ORDER   | Pygophora
 ORDER   | Scandentia
 ORDER   | Scorpiones
 ORDER   | Silicoflagellida
 ORDER   | Siphonaptera
 ORDER   | Squamata
 ORDER   | Trichoptera
 FAMILY  | Acrolepidae
 FAMILY  | Aculeata
 FAMILY  | Admetidae
 FAMILY  | Agaricus familia
 FAMILY  | Amphibia
 FAMILY  | Amphipoda
 FAMILY  | Anura
 FAMILY  | Aves
 FAMILY  | Biporidae
 FAMILY  | Blattodea
 FAMILY  | Bryozoa
 FAMILY  | Calycanthaceae
 FAMILY  | Candelariella
 FAMILY  | Caryophyllaceae
 FAMILY  | Cerylidae
 FAMILY  | Chimaeridae
 FAMILY  | Chondrophora
 FAMILY  | Clavidae
 FAMILY  | Coccoideaceae
 FAMILY  | Coleoptera
 FAMILY  | Cononotidae
 FAMILY  | Convallariaceae
 FAMILY  | Cornirostridae
 FAMILY  | Ctenacaridae
 FAMILY  | Dasyscypha
 FAMILY  | Dehalococcoides
 FAMILY  | Delphinapteridae
 FAMILY  | Deximorpha
 FAMILY  | Dichotoma
 FAMILY  | Diptera
 FAMILY  | Ephyridae
 FAMILY  | Esperiopsidae
 FAMILY  | Gelidiales
 FAMILY  | Haliplectidae
 FAMILY  | Hemiptera
 FAMILY  | Hymenoptera
 FAMILY  | Hypochthoniidae
 FAMILY  | Lemnaceae
 FAMILY  | Lepidoptera
 FAMILY  | Leptodactyla
 FAMILY  | Macropoda
 FAMILY  | Magnosiopsis
 FAMILY  | Medeolariales
 FAMILY  | Mollusca
 FAMILY  | Nematoda
 FAMILY  | Nephrolepidaceae
 FAMILY  | Nephthyidae
 FAMILY  | Odonata
 FAMILY  | Ostracoda
 FAMILY  | Palaeacaridae
 FAMILY  | Pamphagodes
 FAMILY  | Pandalidae
 FAMILY  | Poaceae
 FAMILY  | Polychelidae
 FAMILY  | Pteridophyllaceae
 FAMILY  | Pteridophyta
 FAMILY  | Reptilia
 FAMILY  | Scaliolidae
 FAMILY  | Scorpiones
 FAMILY  | Stoeckeria
 FAMILY  | Thecaphora
 FAMILY  | Trichiidae
 FAMILY  | Tridactylites
 FAMILY  | Umboniidae
 FAMILY  | Urnatellidae
 FAMILY  | Vanellidae
 FAMILY  | Xylella
{noformat}
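
Several names in the list above appear at more than one rank (e.g. Actinobacteria as both PHYLUM and CLASS). A rough way to pull those out of a `RANK | Name` listing is sketched below; the function name `ranks_by_name` is hypothetical and the rows are a small sample copied from the list.

```python
from collections import defaultdict


def ranks_by_name(rows):
    """Map each name to the set of ranks it occurs at.

    Each row is expected to look like ' RANK | Name'.
    """
    ranks = defaultdict(set)
    for row in rows:
        rank, name = (part.strip() for part in row.split("|", 1))
        ranks[name].add(rank)
    return ranks


rows = [
    " PHYLUM  | Actinobacteria",
    " CLASS   | Actinobacteria",
    " CLASS   | Amphibia",
    " FAMILY  | Amphibia",
    " KINGDOM | Archaea",
]
# Keep only names that occur at more than one rank
multi_rank = {
    name: sorted(found) for name, found in ranks_by_name(rows).items() if len(found) > 1
}
print(multi_rank)
```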