Issue 11546

BioVeL: Use API to clean a list of names and tie them to one or more reference taxonomies

Reporter: mdoering
Type: Story
Summary: BioVeL: Use API to clean a list of names and tie them to one or more reference taxonomies
Priority: Major
Status: Open
Created: 2012-07-05 16:43:50.237
Updated: 2013-12-09 13:53:08.498
        
Description: Basic use case for BioVeL using the GBIF web services as discussed with Yde de Yong:

Background:
BioVeL uses workflow tools such as Taverna to pipe data through various web services to do scientific analysis.
Web service formats can be converted from one to another by using converter services. JSON is a readily supported media type.

Outline:
A list of plain, potentially misspelled names, each with an optional classification, should be checked and corrected against one or more reference checklists already available in Checklist Bank. The returned list should contain the original name and its id, plus the "correct" name, status and classification given in each of the reference taxonomies.
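
To make the expected input and output concrete, here is a minimal Python sketch of the record shapes described above. All class and field names are hypothetical, and the example values (dataset key, status string) are made up for illustration; they are not taken from an existing GBIF schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NameQuery:
    """One input row: a plain name that may be misspelled."""
    id: str
    name: str
    # Optional higher classification, e.g. {"kingdom": "Animalia"}.
    classification: Dict[str, str] = field(default_factory=dict)


@dataclass
class ReferenceMatch:
    """What a single reference taxonomy says about the name."""
    dataset_key: str
    correct_name: Optional[str] = None  # None when nothing matched
    status: Optional[str] = None
    classification: Dict[str, str] = field(default_factory=dict)


@dataclass
class CleanedName:
    """One output row: the original input plus one match per reference taxonomy."""
    query: NameQuery
    matches: List[ReferenceMatch] = field(default_factory=list)


# Illustrative values only.
row = CleanedName(
    query=NameQuery(id="1", name="Pumma concolor",
                    classification={"kingdom": "Animalia"}),
    matches=[
        ReferenceMatch(
            dataset_key="example-dataset-key",
            correct_name="Puma concolor",
            status="ACCEPTED",
            classification={"kingdom": "Animalia", "genus": "Puma"},
        )
    ],
)
```

Keeping one ReferenceMatch per datasetKey lets a single output row carry answers from several reference taxonomies, which is what "one or more reference taxonomies" in the summary asks for.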

Current Solution:
0 - client determines the datasetKeys of the reference taxonomies manually via the website
1 - client iterates over the list of names and, for each name:
1.1 - call the nub lookup service with the name and an optional classification (recommended to disambiguate homonyms and do safer fuzzy matching) to retrieve the matching nubUsageKey
1.2 - if nubUsageKey exists, call /name_usage/${nubUsageKey}/related?checklistKey=${checklistKey} to find the usageKey in the reference taxonomy
1.3 - look up the reference usage name, status and classification by calling /name_usage/${usageKey}
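
Steps 1.1 to 1.3 could be sketched in Python roughly as follows. This is an illustration under assumptions, not the actual client: the path of the nub lookup service (`/lookup/name_usage`), its query parameters, and the response field names (`usageKey`, `results`, `key`) are all guesses, and the HTTP call is injected as a hypothetical `fetch_json(url)` callable. Only the two `/name_usage/...` paths come from the steps above.

```python
from typing import Any, Callable, Dict, List, Optional
from urllib.parse import urlencode

FetchJson = Callable[[str], Any]  # hypothetical: takes a URL, returns parsed JSON


def lookup_nub_key(fetch_json: FetchJson, base_url: str, name: str,
                   classification: Optional[Dict[str, str]] = None,
                   nub_lookup_path: str = "/lookup/name_usage") -> Optional[int]:
    """Step 1.1: match a name against the nub (path and fields are assumed)."""
    params = {"name": name, **(classification or {})}
    reply = fetch_json(f"{base_url}{nub_lookup_path}?{urlencode(params)}") or {}
    return reply.get("usageKey")


def lookup_reference_usage(fetch_json: FetchJson, base_url: str,
                           nub_key: int, checklist_key: str) -> Optional[dict]:
    """Steps 1.2 and 1.3: find the related usage in one checklist, then load it."""
    query = urlencode({"checklistKey": checklist_key})
    related = fetch_json(f"{base_url}/name_usage/{nub_key}/related?{query}") or {}
    results = related.get("results") or []  # assumed response shape
    if not results:
        return None
    return fetch_json(f"{base_url}/name_usage/{results[0]['key']}")


def clean_names(fetch_json: FetchJson, base_url: str, rows: List[dict],
                checklist_keys: List[str]) -> List[dict]:
    """Step 1: one nub lookup per name, then one related lookup per checklist."""
    cleaned = []
    for row in rows:
        nub_key = lookup_nub_key(fetch_json, base_url, row["name"],
                                 row.get("classification"))
        matches = {
            ck: (lookup_reference_usage(fetch_json, base_url, nub_key, ck)
                 if nub_key is not None else None)
            for ck in checklist_keys
        }
        cleaned.append({"id": row["id"], "name": row["name"], "matches": matches})
    return cleaned


def fake_fetch(url: str) -> Any:
    """Offline stand-in for the web services, returning made-up replies."""
    if "/related" in url:
        return {"results": [{"key": 77}]}
    if url.endswith("/name_usage/77"):
        return {"scientificName": "Puma concolor", "taxonomicStatus": "ACCEPTED"}
    return {"usageKey": 5} if "Pumma" in url else {}


result = clean_names(
    fake_fetch, "http://example.invalid",
    [{"id": "1", "name": "Pumma concolor"}, {"id": "2", "name": "Nonsense"}],
    ["ck1"],
)
```

Even with the nub lookup done only once per name, this approach costs up to 1 + 2 x (number of checklists) requests for every name in the list.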


Proposed NameList Solution:
0 - client determines the datasetKeys of the reference taxonomies manually via the website
1 - client uploads a list of names with an optional classification (recommended to disambiguate homonyms and do safer fuzzy matching)
2 - client requests a download of that list, passing the reference datasetKeys that should be used to add the extra information.