Issue 13579

Deal with multiple authors & organizations in EML

Reporter: mdoering
Assignee: kbraak
Type: Improvement
Summary: Deal with multiple authors & organizations in EML
Priority: Critical
Resolution: Fixed
Status: Closed
Created: 2013-08-12 08:00:00.917
Updated: 2016-08-29 17:10:38.362
Resolved: 2016-08-29 17:10:38.308
        
Description: The official EML allows the creators, metadataProvider and publisher to be a list of individuals or organizations. The GBIF profile only allows a single one. We see EML out there that has multiple elements, so we should either allow that in the GBIF profile too or at least deal with it somehow when parsing the EML into a registry model, for example by simply concatenating the names.
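
The concatenation fallback mentioned above could look roughly like the following sketch. It is only an illustration, not the registry parser: the EML fragment is a simplified, namespace-free stand-in for a real document, and the helper names `agent_name` and `concatenated_names` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free EML fragment used only for illustration.
EML_SNIPPET = """
<eml>
  <dataset>
    <creator>
      <individualName><givenName>Ada</givenName><surName>Example</surName></individualName>
    </creator>
    <creator>
      <organizationName>Example Institute</organizationName>
    </creator>
  </dataset>
</eml>
""".strip()


def agent_name(agent):
    """Return the person's name if present, otherwise the organization name."""
    given = agent.findtext("individualName/givenName", default="").strip()
    sur = agent.findtext("individualName/surName", default="").strip()
    person = " ".join(part for part in (given, sur) if part)
    return person or agent.findtext("organizationName", default="").strip()


def concatenated_names(eml_text, role):
    """Join the names of every <role> element under <dataset>, in document order."""
    root = ET.fromstring(eml_text)
    names = [agent_name(agent) for agent in root.findall(f"dataset/{role}")]
    return "; ".join(name for name in names if name)


print(concatenated_names(EML_SNIPPET, "creator"))
```

A role with no elements yields an empty string, so the same helper could be called for creator, metadataProvider and publisher alike.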

Attached is an example from LifeDesks.
    
Attachment eml.xml


Author: kbraak@gbif.org
Comment: Do we need to change our GBIF Metadata Profile so that the creators, metadataProvider, and publisher can be a list of individuals or contacts?
Created: 2013-12-12 14:28:16.532
Updated: 2013-12-12 14:28:16.532


Author: mdoering@gbif.org
Comment: We could, but we should at least be able to parse valid EML that arrives with those lists into our models.
Created: 2013-12-12 16:12:01.869
Updated: 2013-12-12 16:12:01.869


Author: kbraak@gbif.org
Created: 2014-05-19 12:43:39.351
Updated: 2014-05-19 12:43:39.351
        
A little background: a Dataset API object has a list of Contacts. A Contact can have a primary flag. The single creator, metadataProvider, and publisher contacts get the primary flag set.

Following this morning's discussion, we agreed that we should start parsing lists of creators, metadataProviders, and publishers. Currently we only parse a single one.

The first contact parsed gets the primary designation, and gets shown in the summary section on the dataset detail page of the portal. The remaining contacts will get added to the list of associated parties, and shown in the associated parties section of the dataset detail page. 
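
The rule described above could be sketched as follows. This is a minimal illustration under assumptions, not the registry code: the `Contact` dataclass, its field names, and the `assign_primary` helper are hypothetical stand-ins for the real API model, and the input is assumed to be a list of (role, name) pairs in EML document order.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    # Hypothetical stand-in for the Dataset API Contact model.
    name: str
    type: str
    primary: bool = False


def assign_primary(parsed):
    """Flag only the first contact seen for each role as primary.

    `parsed` is a list of (role, name) pairs in EML document order;
    the returned list keeps that order.
    """
    seen_roles = set()
    contacts = []
    for role, name in parsed:
        is_first = role not in seen_roles
        seen_roles.add(role)
        contacts.append(Contact(name=name, type=role, primary=is_first))
    return contacts


parsed = [
    ("creator", "First Creator"),
    ("creator", "Second Creator"),
    ("metadataProvider", "First Provider"),
    ("publisher", "First Publisher"),
    ("publisher", "Second Publisher"),
]
contacts = assign_primary(parsed)

# Primary contacts go to the summary section, the rest to associated parties.
summary = [c.name for c in contacts if c.primary]
associated = [c.name for c in contacts if not c.primary]
print(summary)
print(associated)
```

Because the list is never reordered, the order in which contacts were entered in the EML carries through to both sections.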
    


Author: kbraak@gbif.org
Created: 2015-06-23 14:24:48.764
Updated: 2015-06-23 14:24:48.764
        
Update:

GBIF Metadata Profile v1.1 now allows a list of creators, metadataProviders, and contacts.

Still outstanding: the registry-metadata parser must be changed to set the primary designation only on the first creator/metadataProvider/contact encountered. Currently all contacts in these lists are set as primary and are consequently displayed inconsistently on the Portal dataset detail page, which users are complaining about.

 
    


Author: peterdesmet
Created: 2015-10-16 14:31:43.414
Updated: 2015-10-16 14:32:30.116
        
Hi, add me to the list of users complaining about this. :-)

We add our resource contacts, resource creators and metadata providers in a specific order, especially now that resource creators are automatically listed (in order) as the authors in the resource citation. On the GBIF dataset page however, it's not the first resource contact, resource creator or metadata provider that is listed first. See for example http://www.gbif.org/dataset/7f9eb622-c036-44c6-8be9-5793eaa1fa1e, which should have "Koen Devos" as first person for all three roles.

Also, the different names for the roles are confusing... I would try to standardize these:

Resource contact vs administrative contact
Metadata provider vs metadata author
Resource creator vs originator
    


Author: dschigel
Comment: As Kyle knows, we have observed this with a Russian publisher's dataset, http://www.gbif.org/dataset/98333cb6-6c15-4add-aa0e-b322bf1500ba. Note that the roles and names are correct below the map of occurrences but incorrect above the map. Since many data holders at ZIN are old and very sensitive to the correct display of roles, Roman Khalikov made it clear that even though many datasets are waiting at this most important holder of zoological collections in Russia, they can't publish more using the IPT until this contacts issue is solved. He is nostalgic about the DIGIR days, since their old dataset, http://www.gbif.org/dataset/7e34ea34-f762-11e1-a439-00145eb45e9a, is correct and is reindexed nicely.
Created: 2015-10-26 16:12:08.428
Updated: 2015-10-26 16:19:30.256


Author: dschigel
Comment: I would like to draw attention to this as a blocker for more datasets from the Russian Academy of Sciences (RAS). Roman Khalikov, who spotted the issue, is coming to the EU nodes meeting in Lisbon. Walter Berendsohn is planning a data mobilization campaign in the North and South Caucasus, and RAS partners would be key in the collaboration. One of the IAS fitness-for-use experts in 2016 is from RAS. I hope this can be reopened, considering other priorities.
Created: 2016-04-14 20:24:51.324
Updated: 2016-04-14 20:24:51.324


Author: kbraak@gbif.org
Created: 2016-08-29 17:10:38.358
Updated: 2016-08-29 17:10:38.358
        
Work completed.

GBIF now preserves the order that contacts are listed in EML.

http://www.gbif.org/dataset/98333cb6-6c15-4add-aa0e-b322bf1500ba now has the correct primary contacts listed in the summary block.

Publishers wanting to fix the order that contacts appear on the dataset page should contact GBIF Helpdesk directly in order to update the dataset by reinterpreting it from its EML.

Closing issue.