Issue 18449

Simplify list of contact roles shown on GBIF.org

Reporter: kbraak
Assignee: bko
Type: Improvement
Summary: Simplify list of contact roles shown on GBIF.org
Priority: Unassessed
Status: Open
Created: 2016-04-29 13:54:05.594
Updated: 2016-08-29 17:04:16.409
        
Description: The contact roles available to choose from in the IPT are listed here:

http://rs.gbif.org/vocabulary/gbif/agent_role.xml

The contact roles shown on the GBIF Portal are listed here:

https://github.com/gbif/gbif-api/blob/master/src/main/java/org/gbif/api/vocabulary/ContactType.java#L33

When the EML document gets parsed by the GBIF Registry, parsing rules determine which IPT contact role gets converted into which GBIF API contact role:

https://github.com/gbif/registry/blob/master/registry-metadata/src/main/java/org/gbif/registry/metadata/parse/EMLRuleSet.java#L133

As can be seen here, IPT "creators" get converted into "ORIGINATOR",

https://github.com/gbif/registry/blob/master/registry-metadata/src/main/java/org/gbif/registry/metadata/parse/DatasetWrapper.java#L259

IPT "contacts" get converted into "ADMINISTRATIVE_POINT_OF_CONTACT",

https://github.com/gbif/registry/blob/master/registry-metadata/src/main/java/org/gbif/registry/metadata/parse/DatasetWrapper.java#L232

and IPT "metadata providers" get converted into "METADATA_AUTHOR".

https://github.com/gbif/registry/blob/master/registry-metadata/src/main/java/org/gbif/registry/metadata/parse/DatasetWrapper.java#L245

All other contacts get converted according to a lookup by their name in the following class:

https://github.com/gbif/registry/blob/master/registry-metadata/src/main/java/org/gbif/registry/metadata/parse/converter/ContactTypeConverter.java
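
The conversion described above can be sketched as follows. This is a rough Python illustration, not the Registry's Java code: the element names `creator`, `contact`, `metadataProvider` and `associatedParty`, the camelCase-to-enum-name lookup, and the function names are assumptions made for the example. The set of known types is taken only from the role names that appear in this issue.

```python
import re

# Role names that appear in this issue; NOT the full ContactType enum.
KNOWN_TYPES = {
    "CUSTODIAN_STEWARD", "ORIGINATOR", "TECHNICAL_POINT_OF_CONTACT",
    "ADMINISTRATIVE_POINT_OF_CONTACT", "PRINCIPAL_INVESTIGATOR", "PROCESSOR",
    "PUBLISHER", "EDITOR", "OWNER", "DISTRIBUTOR", "DATA_ADMINISTRATOR",
    "CONTENT_PROVIDER", "AUTHOR", "POINT_OF_CONTACT", "METADATA_AUTHOR",
    "USER", "PROGRAMMER",
}

# Fixed conversions for the three dedicated EML elements (as described above).
ELEMENT_TO_TYPE = {
    "creator": "ORIGINATOR",
    "contact": "ADMINISTRATIVE_POINT_OF_CONTACT",
    "metadataProvider": "METADATA_AUTHOR",
}


def to_enum_name(role):
    """Turn a camelCase role such as 'pointOfContact' into 'POINT_OF_CONTACT'."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", role.strip()).upper()


def convert(element, role=None):
    """Return the contact type for an EML element, or None if it is unknown.

    Assumption: any element other than the three dedicated ones carries an
    explicit role string that is looked up by name.
    """
    if element in ELEMENT_TO_TYPE:
        return ELEMENT_TO_TYPE[element]
    if role is None:
        return None
    name = to_enum_name(role)
    return name if name in KNOWN_TYPES else None
```

For example, `convert("creator")` gives `"ORIGINATOR"`, while `convert("associatedParty", "pointOfContact")` gives `"POINT_OF_CONTACT"`.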

In the GBIF API response, the Dataset object has a list of Contacts organised by type:

https://github.com/gbif/gbif-api/blob/master/src/main/java/org/gbif/api/model/registry/Dataset.java#L109

There is an outstanding JIRA issue (POR-2560) to parse lists of contacts, creators, metadataProviders, and project personnel. Currently we parse only a single creator, contact, and metadataProvider from EML. The ability to enter multiple creators, contacts, and metadataProviders was introduced in GBIF Metadata Schema version 1.1 (the latest) and is not part of version 1.0.2 (the version supported by the GBIF API).
    
Attachment Contact role mappings.xlsx
Attachment Roles14July.xlsx


Author: kbraak@gbif.org
Comment: [~dschigel] as requested, here is information about the contact roles the IPT uses, and the GBIF Portal displays, with an explanation of how roles get converted (see description above). 
Created: 2016-04-29 13:55:38.897
Updated: 2016-04-29 13:55:38.897


Author: kbraak@gbif.org
Created: 2016-06-15 14:10:59.336
Updated: 2016-06-15 14:10:59.336
        
[~dschigel] please review the attached spreadsheet showing my suggested simplifications for contact roles shown in the portal. These include:

1. Replacing role ADMINISTRATIVE_POINT_OF_CONTACT with POINT_OF_CONTACT. Maintaining both types is confusing to users.
2. Keeping primary ORIGINATORs and non-primary ORIGINATORs separate. It is important to maintain the list of primary ORIGINATORs (aka creators) because these correspond to the citation's creator list.
3. Adding missing role CURATOR to GBIF API, currently available to choose from in the IPT.

In my opinion, IPT users select each role carefully and we should preserve as many distinct roles as possible. What we need to do in the portal is provide users with a definition of each contact role, just like the IPT does, to clarify exactly what each role means.



Author: dschigel
Comment: I checked the roles table and I agree with the changes in pink and with the remarks. However, the number of roles is very high and may therefore be confusing for some publishers and for many users. Are there many users who need such fine-grained roles? We could keep the fine granularity of roles uncollapsed at the source webpages (example: http://danbif.au.dk/ipt/resource?r=rooftop), but merge and skip roles for the web representation at GBIF.org, leaving only the top 4 (or top 5, or top 7). Those should satisfy key credit and citation needs; the rest is detail. Which and how many roles to keep for the web depends on the distribution of actual use across datasets: let's let the facts decide and use the stats requested from Jan.
Created: 2016-06-15 14:58:56.349
Updated: 2016-06-15 14:58:56.349


Author: cgendreau
Created: 2016-07-01 13:30:53.34
Updated: 2016-07-01 13:30:53.34
        
Contact types + counts from the Registry database

 --empty--                       |  9358
 CUSTODIAN_STEWARD               |   375
 ORIGINATOR                      | 22623
 TECHNICAL_POINT_OF_CONTACT      |  2253
 ADMINISTRATIVE_POINT_OF_CONTACT | 26096
 PRINCIPAL_INVESTIGATOR          |   208
 PROCESSOR                       |   132
 PUBLISHER                       | 19237
 EDITOR                          |   302
 OWNER                           |    48
 DISTRIBUTOR                     | 17672
 DATA_ADMINISTRATOR              |    31
 CONTENT_PROVIDER                |   681
 AUTHOR                          |  1374
 POINT_OF_CONTACT                |  1973
 METADATA_AUTHOR                 | 20644
 USER                            |   580
 PROGRAMMER                      |   580
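
To relate these counts to the "top 4 (top 5, top 7)" idea raised earlier in this thread, the ranking can be sketched as below. This is only an illustration ranked by raw count: the function name `top_roles` is made up for the example, the `--empty--` row is represented by an empty string and skipped, and the result is not necessarily the selection in the attached spreadsheets.

```python
# Counts copied from the Registry listing above; "" stands for the --empty-- row.
ROLE_COUNTS = {
    "": 9358,
    "CUSTODIAN_STEWARD": 375,
    "ORIGINATOR": 22623,
    "TECHNICAL_POINT_OF_CONTACT": 2253,
    "ADMINISTRATIVE_POINT_OF_CONTACT": 26096,
    "PRINCIPAL_INVESTIGATOR": 208,
    "PROCESSOR": 132,
    "PUBLISHER": 19237,
    "EDITOR": 302,
    "OWNER": 48,
    "DISTRIBUTOR": 17672,
    "DATA_ADMINISTRATOR": 31,
    "CONTENT_PROVIDER": 681,
    "AUTHOR": 1374,
    "POINT_OF_CONTACT": 1973,
    "METADATA_AUTHOR": 20644,
    "USER": 580,
    "PROGRAMMER": 580,
}


def top_roles(counts, n):
    """Return the n most frequent named roles, ties broken alphabetically."""
    named = {role: count for role, count in counts.items() if role}
    return sorted(named, key=lambda role: (-named[role], role))[:n]


print(top_roles(ROLE_COUNTS, 5))
# ['ADMINISTRATIVE_POINT_OF_CONTACT', 'ORIGINATOR', 'METADATA_AUTHOR',
#  'PUBLISHER', 'DISTRIBUTOR']
```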
    


Author: bko@gbif.org
Created: 2016-08-04 11:08:52.78
Updated: 2016-08-04 11:08:52.78
        
Instead of letting it sink in my inbox: the currently suggested roles, and their order for the heading of the dataset detail page, are attached (Roles14July.xlsx).

As you can see, I'll put the top 5 in the heading in the next revision.

The rest of the roles should still show in the contacts section.
    


Author: kbraak@gbif.org
Created: 2016-08-29 17:04:16.409
Updated: 2016-08-29 17:04:16.409
        
Below is a list of changes that have been made with respect to the way we handle contacts, following the implementation of POR-2563 and POR-523:

1. We can now handle multiple contacts/ADMINISTRATIVE_POINT_OF_CONTACT, creators/ORIGINATORS, metadataProviders/METADATA_AUTHORS and project personnel supplied in EML. Previously we only handled a single contact of each kind.
2. We now preserve contact order, meaning that our API returns contacts in the same order they are listed in EML.
3. The missing role CURATOR was added to GBIF API.

Please note that there is a constraint in the registry database that only allows one primary contact per role. We decided to uphold this constraint, and this prevented me from being able to maintain lists of primary and secondary contacts per role (e.g. for the purpose of building the citation from the primary creators/ORIGINATORs). This also meant that I couldn't merge ADMINISTRATIVE_POINT_OF_CONTACT and POINT_OF_CONTACT into one role. Therefore primary contacts will still be assigned the role ADMINISTRATIVE_POINT_OF_CONTACT while associated parties with role "pointOfContact" will still be assigned the role POINT_OF_CONTACT.

[~hoefft], [~bko@gbif.org] it is important to be mindful of all the above when developing the reengineered dataset page.
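
For anyone consuming the contact list, the two properties described above (contacts keep their EML order, and at most one primary contact per role) could be checked roughly as follows. This is a minimal sketch: contacts are modelled as plain dictionaries, and the keys `type`, `primary` and `name` as well as both function names are assumptions for the example, not the GBIF API model.

```python
from collections import Counter


def roles_with_multiple_primaries(contacts):
    """Return the roles that have more than one contact flagged as primary."""
    counts = Counter(c["type"] for c in contacts if c.get("primary"))
    return sorted(role for role, n in counts.items() if n > 1)


def names_by_role(contacts):
    """Group contact names by role, keeping the order in which they arrive."""
    grouped = {}
    for contact in contacts:
        grouped.setdefault(contact["type"], []).append(contact["name"])
    return grouped


# Hypothetical contacts, listed in the order they would appear in EML.
contacts = [
    {"type": "ORIGINATOR", "primary": True, "name": "A"},
    {"type": "ORIGINATOR", "primary": False, "name": "B"},
    {"type": "ADMINISTRATIVE_POINT_OF_CONTACT", "primary": True, "name": "C"},
    {"type": "POINT_OF_CONTACT", "primary": False, "name": "D"},
]

print(roles_with_multiple_primaries(contacts))  # []
print(names_by_role(contacts)["ORIGINATOR"])    # ['A', 'B']
```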