Issue 18112

Poor NUB matching with some binomials with unspecified parts

Reporter: mblissett
Type: Bug
Summary: Poor NUB matching with some binomials with unspecified parts
Priority: Major
Resolution: Fixed
Status: Closed
Created: 2015-12-28 16:46:06.071
Updated: 2016-09-08 14:17:51.472
Resolved: 2016-09-08 14:17:51.432
        
Description: Names like "Elmis sp. Lv." previously matched the genus name (Elmis Latreille, 1802) but now match a higher taxon, usually the family.

The capital L in this example is significant: "Elmis sp. lv." matches the genus, as does "Elmis sp.".

This also happens with names like "Lasioglossum (Dialictus)", which used to match Lasioglossum Curtis, 1833 and now matches Dialictus Robertson, 1902.

The last interpreted date of the affected occurrences is April 2015.

Example: http://api.gbif.org/v1/occurrence/1037462235 (has been reinterpreted).
    


Author: mblissett
Created: 2015-12-29 12:43:32.125
Updated: 2015-12-29 12:43:32.125
        
According to Dmitry, names in the form "Elmis sp. Lv." don't make much sense.

There are 489 distinct names with "sp." as the rank.
    


Author: mdoering@gbif.org
Created: 2016-09-08 13:33:11.475
Updated: 2016-09-08 13:33:11.475
        
I suspect that "sp. Lv." might indicate aquatic species.
The occurrence above comes from river monitoring, and these are other examples I found by googling:

Examples:
http://hydro.chmi.cz/isarrow/taxon.php?seq=3289306&parent_seq=3289306&lng=eng
http://hydro.chmi.cz/isarrow/taxon.php?seq=3242001&parent_seq=3242001
https://books.google.de/books?id=zaoxBQAAQBAJ&pg=PA343&lpg=PA343&dq=Elmis+sp.+Lv.&source=bl&ots=XyApLXzErF&sig=mTnur_2JUgyR0_86IeNPZsWE8_0&hl=de&sa=X&redir_esc=y#v=onepage&q=Elmis%20sp.%20Lv.&f=false
    


Author: mdoering@gbif.org
Created: 2016-09-08 14:02:03.768
Updated: 2016-09-08 14:02:03.768
        
It is definitely a pattern found in sources:
https://books.google.de/books?id=iK_wCAAAQBAJ&lpg=PA121&ots=yAN_cmHYZh&dq=species%20%22sp.%20Lv.%22&hl=de&pg=PA112#v=onepage&q=%22sp.%20Lv.%22&f=false


Here we also see "sp. Ad.", so the suffix appears to indicate the larval or adult life stage.
This makes sense, as I can only find this pattern for insects. We could strip "Lv." and "Ad." from names during name parsing, but this is dangerous because it is likely to clash with real authors called "Lv" or "Ad". We can at least remove them for undetermined species like "sp. Lv."

At least IPNI is not aware of authors "Lv" or "Ad":
http://www.ipni.org/ipni/advAuthorSearch.do?find_forename=&find_surname=&find_abbreviation=Lv&find_isoCountry=&output_format=normal&back_page=authorsearch&query_type=by_query
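
For illustration, the restricted stripping proposed above could look roughly like the sketch below. It is not the code from the name parser: the function name `strip_life_stage` and the regular expression are hypothetical. It assumes that only the capitalised markers "Lv." and "Ad." are affected and that they are dropped only when they directly follow "sp." or "spp.", so a real author abbreviation after a species epithet is left alone.

```python
import re

# Hypothetical pattern: an undetermined-species marker ("sp." or "spp.")
# followed by a life-stage abbreviation ("Lv." for larva, "Ad." for adult).
# Only the capitalised forms are matched, since those are the ones reported
# to break matching.
_LIFE_STAGE_AFTER_SP = re.compile(r"\b(spp?\.)\s+(?:Lv|Ad)\.?(?=\s|$)")


def strip_life_stage(name: str) -> str:
    """Drop a trailing life-stage marker, but only after "sp."/"spp."."""
    # Keep the rank marker (group 1) and discard the life-stage suffix.
    return _LIFE_STAGE_AFTER_SP.sub(r"\1", name).strip()


if __name__ == "__main__":
    for raw in ("Elmis sp. Lv.", "Elmis sp. Ad.", "Elmis sp.", "Elmis aenea Lv."):
        print(repr(raw), "->", repr(strip_life_stage(raw)))
```

With this restriction, "Elmis sp. Lv." and "Elmis sp. Ad." both reduce to "Elmis sp.", while a name such as "Elmis aenea Lv." is left unchanged because "Lv." could be an author there.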
    


Author: mdoering@gbif.org
Comment: https://github.com/gbif/name-parser/commit/ea8ad167660064ef966a9a1cc0066889ebfc23cc
Created: 2016-09-08 14:17:51.469
Updated: 2016-09-08 14:17:51.469