Issue 16656

Errors in taxon authorship for 1,798 beetle names

Reporter: rdmpage
Type: Bug
Summary: Errors in taxon authorship for 1,798 beetle names
Priority: Critical
Status: ToDo
Created: 2014-11-19 17:23:24.367
Updated: 2016-08-02 11:02:42.952
        
Description: This is an example of an issue that affects a number of names in GBIF, but the fault lies with the data providers. I'm giving this example because it came up in discussion with Ken Walker as one of his all-time favourite annoyances.

Searching GBIF for "Wood & Bright 1992" http://www.gbif.org/species/search?q=Wood+%26+Bright+1992&dataset_key=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c yields 1,798 names, all with the authorship "Wood & Bright 1992", implying that those people are the authors of those names. They aren't: they simply wrote a catalogue in 1992 (see http://biostor.org/reference/143867 ) in which they listed various names, but they did not describe ANY new species. It looks like this error comes from Catalogue of Life, which in turn got it from WTaxa.

As an example, the first hit is "Pseudocryphalus sidneyanus Wood & Bright, 1992" http://www.gbif.org/species/5021562. After some Googling, this appears to be also known as "Cryphalus sidneyanus" http://www.gbif.org/species/1212094 and was originally described as "Bostrichus sidneyanus" by Nordlinger in 1856 http://www.gbif.org/species/5022125. GBIF has all three names as distinct taxa, when they are objective synonyms. The original description is in Google Books http://books.google.co.uk/books?id=N5RRAAAAcAAJ&pg=PA75#v=onepage&q=sidneyanus&f=false, and the move to Cryphalus is in BioStor http://biostor.org/reference/144103 (also http://dx.doi.org/10.1002/mmnd.18680120209 ).

So, a mess that will take some sorting out. Meantime, any name with the authorship "Wood & Bright, 1992" should be treated with caution: they are not the authors, and it is likely that a bunch of synonyms are being obscured because of this.
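
As a rough illustration of treating these names with caution, here is a minimal Python sketch that flags records whose authorship string matches the catalogue citation, with or without the comma. The record layout (dicts with "scientificName" and "authorship" keys), the function names, and the third sample record are assumptions made for illustration; this is not how GBIF or CoL store or check names.

```python
import re


def normalize_authorship(authorship):
    """Drop commas and parentheses, collapse whitespace, and casefold."""
    cleaned = re.sub(r"[,()]", " ", authorship or "")
    return " ".join(cleaned.split()).casefold()


# The citation reported in this issue, in normalized form.
SUSPECT = normalize_authorship("Wood & Bright, 1992")


def flag_suspect_names(records, suspect=SUSPECT):
    """Return the records whose authorship equals the suspect citation."""
    return [
        rec for rec in records
        if normalize_authorship(rec.get("authorship", "")) == suspect
    ]


# Sample records: the first two come from the issue text, the third is a
# hypothetical placeholder showing the comma-less variant of the citation.
records = [
    {"scientificName": "Pseudocryphalus sidneyanus", "authorship": "Wood & Bright, 1992"},
    {"scientificName": "Bostrichus sidneyanus", "authorship": "Nordlinger, 1856"},
    {"scientificName": "Examplus fictus", "authorship": "Wood & Bright 1992"},
]

flagged = flag_suspect_names(records)
for rec in flagged:
    print(rec["scientificName"], "|", rec["authorship"])
```

Exact matching after normalization is deliberately strict: it would not flag a legitimately different citation that merely contains the same surnames.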

More generally, I suspect that this happens a LOT, and reflects the sometimes crappy data GBIF is getting from CoL. If I get the chance I'll document problems with butterfly names in another issue.



Author: mdoering@gbif.org
Comment: Thanks Rod, this highlights once again the bottlenecks we face with all the closed taxonomic systems. If one could just go to CoL and fix the error... I will think about a better way of applying fixes in GBIF at least. The current thinking is still that we will have a minimal GitHub-based taxon CSV file (DwC-A compliant) that we can all manage together and that overrides any of the other backbone-building checklists like the Catalogue of Life. Instead of a single file, maybe even a folder with lots of small separate patch files...
Created: 2014-11-20 13:49:18.187
Updated: 2014-11-20 13:49:18.187
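
The patch-file idea in the comment above could look roughly like the following Python sketch. It assumes each patch is a small CSV keyed by a taxonID column, that empty cells mean "leave unchanged", and that patches are applied in order so later ones win. The column names are Darwin Core terms, but the file layout, the function names, and the corrected authorship value are illustrative assumptions, not an agreed GBIF format or a vetted correction.

```python
import csv
import io


def load_patch(csv_text):
    """Parse one patch file into {taxonID: {column: value}}, keeping non-empty cells."""
    patch = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        taxon_id = row.pop("taxonID")
        patch[taxon_id] = {col: val for col, val in row.items() if val}
    return patch


def apply_patches(backbone, patches):
    """Return a copy of the backbone with each patch applied in order (later wins)."""
    merged = {tid: dict(rec) for tid, rec in backbone.items()}
    for patch in patches:
        for taxon_id, fields in patch.items():
            merged.setdefault(taxon_id, {}).update(fields)
    return merged


# One backbone record, using the GBIF species ID quoted in the issue.
backbone = {
    "5021562": {
        "scientificName": "Pseudocryphalus sidneyanus",
        "scientificNameAuthorship": "Wood & Bright, 1992",
    }
}

# A hypothetical patch file: only the authorship cell is filled in.
patch_csv = (
    "taxonID,scientificName,scientificNameAuthorship\n"
    '5021562,,"(Nordlinger, 1856)"\n'
)

patched = apply_patches(backbone, [load_patch(patch_csv)])
print(patched["5021562"])
```

Keeping each patch small and field-level means a folder of many such files can be reviewed and merged independently, which is the attraction of the multi-file variant.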


Author: rdmpage
Comment: Hi Markus, The GitHub idea makes sense. Having a series of "patch files" might make it attractive to others to take part. For example, I'm about to add an automated synonym finder to BioNames that discovers possible synonyms based on having names with the same species epithet on the same page (i.e., it attempts to find synonym lists in the primary literature). I could imagine building synonym lists and making them available as DWCA files. I could also imagine people creating lists of taxon names, publishing them in, say, Biodiversity Data Journal so they get a publication, then contributing the DWCA file to the GitHub repository. If this seems a sensible strategy, maybe it would be useful to thrash out the details and advertise the process. Do you envisage something like https://github.com/jhpoelen/eol-globi-data where people raise issues saying that they have data, and GLOBI then add the data?
Created: 2014-11-20 17:12:58.45
Updated: 2014-11-20 17:12:58.45
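
The synonym-finder heuristic described in the comment above (names sharing a species epithet on the same page) can be sketched as follows. This is a minimal illustration, not the BioNames implementation: the input format (page identifier plus name string), the function name, and the page identifiers are assumptions, and it ignores gender-dependent epithet endings such as -us/-a/-um.

```python
from collections import defaultdict


def candidate_synonyms(page_names):
    """Group binomials by (page, species epithet); keep groups with 2+ distinct names."""
    groups = defaultdict(set)
    for page_id, name in page_names:
        parts = name.split()
        if len(parts) < 2:
            continue  # skip anything that is not at least a binomial
        epithet = parts[1].lower()
        groups[(page_id, epithet)].add(" ".join(parts[:2]))
    return {
        key: sorted(names)
        for key, names in groups.items()
        if len(names) > 1
    }


# Names taken from the issue text; the page identifiers are hypothetical.
page_names = [
    ("page-1", "Cryphalus sidneyanus"),
    ("page-1", "Bostrichus sidneyanus"),
    ("page-2", "Pseudocryphalus sidneyanus"),
]

candidates = candidate_synonyms(page_names)
print(candidates)
```

The output is only a list of candidates: a shared epithet on one page suggests a synonym list, but each pair still needs checking against the literature.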


Author: mdoering@gbif.org
Created: 2016-08-02 11:02:06.765
Updated: 2016-08-02 11:02:06.765
        
The problem still persists in the latest August 2016 backbone, with 2,011 names having the authorship "Wood & Bright, 1992":
http://www.gbif-uat.org/species/search?q=Wood+%26+Bright%2C+1992&dataset_key=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c

I have reported the issue to CoL; maybe they can get on top of this.