Over 168.000 species homonyms in the backbone for real?
18369
Reporter: mdoering
Assignee: mdoering
Type: Bug
Summary: Over 168.000 species homonyms in the backbone for real?
Description: There are over 168.000 species homonyms in the backbone, accounting for over 98 million occurrences, which seems a lot. Investigate whether that is correct. Key list attached.
Priority: Critical
Status: Open
Created: 2016-04-05 17:18:51.517
Updated: 2016-05-09 12:30:27.215
Author: rdmpage
Created: 2016-04-07 10:26:26.08
Updated: 2016-04-07 10:26:26.08
Not sure that I follow this, the "homonyms" all seem to have different authors, e.g.
1 6910 UNRANKED ? apicalis 2 {7425198,8133285} {"? apicalis Enderlein, 1920","? apicalis Szepligeti, 1906"}
Why would these be considered to be homonyms?
Author: mdoering@gbif.org
Comment: Maybe I used the term homonym badly, but I was looking into how many names exist more than once if you ignore the authorship. This is currently causing some problems for the occurrence species matching (POR-2768), and I wanted to get an idea of how large the problem is. And it is quite large, affecting 98 million occurrence records!
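A minimal sketch of that count, assuming each usage is available as a (canonical name, full scientific name) pair. The function name `find_canonical_duplicates` and the input shape are illustrative only and are not taken from the checklistbank code:

```python
from collections import defaultdict


def find_canonical_duplicates(usages):
    """Group usages by canonical name (authorship ignored) and keep
    only the canonical names that occur more than once.

    `usages` is an iterable of (canonical_name, scientific_name) pairs;
    this input shape is an assumption made for the sketch.
    """
    groups = defaultdict(list)
    for canonical, scientific in usages:
        groups[canonical].append(scientific)
    return {name: full for name, full in groups.items() if len(full) > 1}


# Names quoted in this issue; only "? apicalis" occurs twice.
sample = [
    ("? apicalis", "? apicalis Enderlein, 1920"),
    ("? apicalis", "? apicalis Szepligeti, 1906"),
    ("Abgrallaspis flabellata", "Abgrallaspis flabellata (Ferris, 1938)"),
]
duplicates = find_canonical_duplicates(sample)
```

Here `duplicates` holds a single entry for "? apicalis" with both authored names, which is the sense in which "homonym" is used in this issue.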
Created: 2016-04-07 10:44:48.625
Updated: 2016-04-07 10:44:48.625
Author: mdoering@gbif.org
Created: 2016-04-11 16:17:21.4
Updated: 2016-04-11 16:17:44.977
Probably drastically reduced by https://github.com/gbif/checklistbank/commit/a14506422848bcba7b111b34689f7b1790f8035d
The family was required to match up, which often is not the case.
Author: mdoering@gbif.org
Created: 2016-04-14 14:17:56.305
Updated: 2016-04-14 14:17:56.305
In the improved nub there are now 312.127 homonyms, 210.202 of them for species. This includes changed sources, e.g. 32k new names coming from new Plazi sources and an updated CoL. At first glance most of them look correct, in the sense that they do differ by authorship and there is no longer any name without authors.
-----
Lots of basionym duplications still exist because they are just lonely epithets with different authors.
For example: "? acuminatus Motschulsky, 1844" and "? acuminatus Emerton, 1913"
These account for 2229 homonym canonical names
-----
By far the most seem to be genuinely different names. Examples:
{noformat}
Abgrallaspis flabellata {8384876,2088164} {"Abgrallaspis flabellata (Ferris, 1938)","Abgrallaspis flabellata Davidson, 1964"}
Abgrallaspis fraxini {2088195,8163185} {"Abgrallaspis fraxini Balachowsky, 1953","Abgrallaspis fraxini (McKenzie, 1944)"}
Abgrallaspis furcillae {8271615,2088199} {"Abgrallaspis furcillae (Brain, 1918)","Abgrallaspis furcillae Balachowsky, 1956"}
Abgrallaspis ithacae {8071717,2088170} {"Abgrallaspis ithacae Davidson, 1964","Abgrallaspis ithacae (Ferris, 1938)"}
Abgrallaspis latastei {8115977,2088206} {"Abgrallaspis latastei (Cockerell, 1894)","Abgrallaspis latastei Komosinska, 1969"}
Acanthoderes curvistriata | 1152354 | SPECIES | Acanthoderes (Psapharochrus) curvistriata Tippmann, 1960
Acanthoderes curvistriata | 8028112 | SPECIES | Acanthoderes curvistriata Zajciw, 1969
Acanthoderes curvistriata | 8381622 | SPECIES | Acanthoderes (Psapharochrus) curvistriata Gilmour, 1965
Acanthoderes curvistriata | 7522158 | SPECIES | Acanthoderes curvistriata Júlio, Giorgi & Monné, 2000
{noformat}
Many of these correct homonyms come from "ex" or "in" author citations. Maybe something is sometimes mixed up in the sources?
Examples:
{noformat}
Acacia sentis | 2979108 | SPECIES | Acacia sentis F.Muell.
Acacia sentis | 7536326 | SPECIES | Acacia sentis F. Muell. ex Benth.
Acacia sirissa | 8170030 | SPECIES | Acacia sirissa Buch.-Ham.
Acacia sirissa | 7629537 | SPECIES | Acacia sirissa Buch.-Ham. ex Wall.
Acacia sirissa | 3973437 | SPECIES | Acacia sirissa Buch.-Ham. ex Voigt
{noformat}
-----
There are only a few suspicious-looking names. This one is truly the same name:
{noformat}
Abacetus reflexicollis | 8026980 | SPECIES | Abacetus reflexicollis (Fairmaire, 1887)
Abacetus reflexicollis | 8433053 | SPECIES | Abacetus reflexicollis (Fairmaire, 1887)
{noformat}
-----
Then there are a few where it looks like the author comparison didn't do a good job:
{noformat}
Abutilon anderssonianum | 7879667 | SPECIES | Abutilon anderssonianum Garcke in Andersson
Abutilon anderssonianum | 3940391 | SPECIES | Abutilon anderssonianum Garcke ex Andersson
Abutilon halophilum | 3938486 | SPECIES | Abutilon halophilum F.Muell.
Abutilon halophilum | 7930829 | SPECIES | Abutilon halophilum F. Müll.
{noformat}
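A rough sketch of a more lenient authorship comparison that would treat the pairs above as equal. This is not the comparison the nub build actually uses. The function `normalize_authorship` is hypothetical, and collapsing "in" and "ex" into the same thing is an assumption that discards a real nomenclatural distinction:

```python
import re
import unicodedata

# German umlauts are commonly transliterated with a trailing "e" (ü -> ue).
UMLAUTS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
})


def normalize_authorship(authorship):
    """Return a comparison key for an authorship string (hypothetical helper)."""
    # Compose first so that "u" + combining diaeresis is also seen as "ü".
    text = unicodedata.normalize("NFC", authorship).translate(UMLAUTS)
    # Strip any remaining diacritics, e.g. "Júlio" -> "Julio".
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower()
    # Assumption: ignore the "in" / "ex" connectors between authors.
    text = re.sub(r"\b(in|ex)\b", " ", text)
    # Drop punctuation and whitespace so "F.Muell." equals "F. Muell.".
    return re.sub(r"[^a-z0-9]", "", text)


same_muell = normalize_authorship("F.Muell.") == normalize_authorship("F. Müll.")
same_garcke = (normalize_authorship("Garcke in Andersson")
               == normalize_authorship("Garcke ex Andersson"))
still_distinct = (normalize_authorship("Buch.-Ham.")
                  != normalize_authorship("Buch.-Ham. ex Wall."))
```

With this key the two Abutilon pairs collapse, while "Buch.-Ham." and "Buch.-Ham. ex Wall." stay distinct because the second citation names an additional author.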
-----
159 names are found at ranks above genus. With few exceptions, these all have at least one genus counterpart:
{noformat}
KINGDOM | Archaea
KINGDOM | Bacteria
PHYLUM | Acanthocephala
PHYLUM | Actinobacteria
PHYLUM | Aquificae
PHYLUM | Bryozoa
PHYLUM | Cephalorhyncha
PHYLUM | Chaetognatha
PHYLUM | Chlamydiae
PHYLUM | Chrysiogenetes
PHYLUM | Ciliophora
PHYLUM | Ctenophora
PHYLUM | Deferribacteres
PHYLUM | Elusimicrobia
PHYLUM | Foraminifera
PHYLUM | Gastrotricha
PHYLUM | Gemmatimonadetes
PHYLUM | Microsporidia
PHYLUM | Mollusca
PHYLUM | Nematoda
PHYLUM | Nitrospira
PHYLUM | Onychophora
PHYLUM | Pteridophyta
PHYLUM | Rotifera
PHYLUM | Thermodesulfobacteria
PHYLUM | Thermotogae
CLASS | Acantharia
CLASS | Actinobacteria
CLASS | Amphibia
CLASS | Appendicularia
CLASS | Aquificae
CLASS | Aves
CLASS | Chlamydiae
CLASS | Chrysiogenetes
CLASS | Deferribacteres
CLASS | Dothideales
CLASS | Elusimicrobia
CLASS | Euglenida
CLASS | Gemmatimonadetes
CLASS | Lingulata
CLASS | Myxomycetes
CLASS | Nitrospira
CLASS | Oligochaeta
CLASS | Ostracoda
CLASS | Polychaeta
CLASS | Polycystina
CLASS | Reptilia
CLASS | Saccharomycetes
CLASS | Thermodesulfobacteria
CLASS | Thermotogae
ORDER | Amphipoda
ORDER | Antipatharia
ORDER | Anura
ORDER | Blattodea
ORDER | Brachypoda
ORDER | Cingulata
ORDER | Coccoideaceae
ORDER | Coleoptera
ORDER | Diplura
ORDER | Diptera
ORDER | Dothideales
ORDER | Euglenida
ORDER | Gelidiales
ORDER | Hemiptera
ORDER | Heteropoda
ORDER | Hygrophila
ORDER | Hymenoptera
ORDER | Isopoda
ORDER | Isoptera
ORDER | Laurida
ORDER | Lepidoptera
ORDER | Lobata
ORDER | Mecoptera
ORDER | Medeolariales
ORDER | Megaloptera
ORDER | Mytiloida
ORDER | Neuroptera
ORDER | Octopoda
ORDER | Odonata
ORDER | Parachela
ORDER | Pholidota
ORDER | Phragmophora
ORDER | Pilosa
ORDER | Plecoptera
ORDER | Proboscidea
ORDER | Pygophora
ORDER | Scandentia
ORDER | Scorpiones
ORDER | Silicoflagellida
ORDER | Siphonaptera
ORDER | Squamata
ORDER | Trichoptera
FAMILY | Acrolepidae
FAMILY | Aculeata
FAMILY | Admetidae
FAMILY | Agaricus familia
FAMILY | Amphibia
FAMILY | Amphipoda
FAMILY | Anura
FAMILY | Aves
FAMILY | Biporidae
FAMILY | Blattodea
FAMILY | Bryozoa
FAMILY | Calycanthaceae
FAMILY | Candelariella
FAMILY | Caryophyllaceae
FAMILY | Cerylidae
FAMILY | Chimaeridae
FAMILY | Chondrophora
FAMILY | Clavidae
FAMILY | Coccoideaceae
FAMILY | Coleoptera
FAMILY | Cononotidae
FAMILY | Convallariaceae
FAMILY | Cornirostridae
FAMILY | Ctenacaridae
FAMILY | Dasyscypha
FAMILY | Dehalococcoides
FAMILY | Delphinapteridae
FAMILY | Deximorpha
FAMILY | Dichotoma
FAMILY | Diptera
FAMILY | Ephyridae
FAMILY | Esperiopsidae
FAMILY | Gelidiales
FAMILY | Haliplectidae
FAMILY | Hemiptera
FAMILY | Hymenoptera
FAMILY | Hypochthoniidae
FAMILY | Lemnaceae
FAMILY | Lepidoptera
FAMILY | Leptodactyla
FAMILY | Macropoda
FAMILY | Magnosiopsis
FAMILY | Medeolariales
FAMILY | Mollusca
FAMILY | Nematoda
FAMILY | Nephrolepidaceae
FAMILY | Nephthyidae
FAMILY | Odonata
FAMILY | Ostracoda
FAMILY | Palaeacaridae
FAMILY | Pamphagodes
FAMILY | Pandalidae
FAMILY | Poaceae
FAMILY | Polychelidae
FAMILY | Pteridophyllaceae
FAMILY | Pteridophyta
FAMILY | Reptilia
FAMILY | Scaliolidae
FAMILY | Scorpiones
FAMILY | Stoeckeria
FAMILY | Thecaphora
FAMILY | Trichiidae
FAMILY | Tridactylites
FAMILY | Umboniidae
FAMILY | Urnatellidae
FAMILY | Vanellidae
FAMILY | Xylella
{noformat}