Issue 10822

ROR.collector_name can contain a concatenated list of values that should be parsed-out

Reporter: kbraak
Type: Bug
Summary: ROR.collector_name can contain a concatenated list of values that should be parsed-out
Priority: Major
Status: Open
Created: 2012-02-14 15:38:36.951
Updated: 2016-02-15 13:45:38.036
        
Description: ROR.collector_name corresponds to DwC's recordedBy:

"A list (concatenated and separated) of names of people, groups, or organizations responsible for recording the original Occurrence. The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first."

In the current OLD processing, we grab all collector names for a dataset, and write string tags for them (unparsed). For example, on mogo I have just run:

mysql> select * from string_tag where tag_id = 4161 and entity_id = 94 order by value;

Which gives me 100 collector names (unparsed) for data resource 94.

I'm not sure how the string tags actually get used in the OLD Portal though.

Anyway, the point of parsing them properly is that collector names and identifier names might become more important in the new Portal. Collectors are something users may search for, though standardisation/clustering will be messy (e.g. "Charles Darwin"). Identifiers (that is, the people who identified a taxon) may become important in the context of data quality markers.
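
To make the parsing step concrete, here is a minimal sketch in Python of how a concatenated collector_name value might be split into individual names. It is not the existing OLD processing code. The function name parse_collector_names and the choice of separators (pipe, semicolon, ampersand, and the word "and") are assumptions for illustration; the real data may well use other separators. Commas are left alone on purpose, since they also appear inside single names such as "Darwin, C.".

```python
import re

# Assumed separators: "|", ";", "&" and the standalone word "and".
# Commas are NOT treated as separators because they also occur inside
# a single name written as "Surname, Initials".
_SEPARATORS = re.compile(r"\s*[|;&]\s*|\s+and\s+", flags=re.IGNORECASE)


def parse_collector_names(raw):
    """Split a concatenated collector_name value into individual names.

    Hypothetical helper: returns the names in their original order,
    with surrounding whitespace trimmed and exact duplicates dropped.
    """
    if not raw:
        return []
    names = []
    for part in _SEPARATORS.split(raw):
        name = " ".join(part.split())  # collapse runs of whitespace
        if name and name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    print(parse_collector_names("Charles Darwin | Robert FitzRoy"))
    print(parse_collector_names("Darwin, C.; FitzRoy, R."))
```

This only splits the string. It does not attempt the standardisation/clustering mentioned above, so different spellings of the same person would still come out as distinct names.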