Issue 13471

EML meta not acknowledging all publishers. Occ...

13471
Reporter: jlegind
Type: Improvement
Summary: EML meta not acknowledging all publishers. Occ...
Priority: Critical
Status: Open
Created: 2013-07-04 10:32:42.203
Updated: 2014-05-15 15:58:08.589
        
        
Description: EML meta not acknowledging all publishers.

Occurrence search for the snow mole in Germany < 1950 returned 7 records from two publishers:
-Field Museum
-James R. Slater Museum of Natural History

The EML metadata file lists only Field Museum as an associated party!
http://api.gbif.org/occurrence/download/0000139-130617162047391.zip

The JR Slater dataset 'Terrestrial vertebrates' is acknowledged in the abstract section though.
    
Attachment 0000094-130819161304856.zip


Author: jlegind@gbif.org
Comment: Sorry, I meant the snow vole (not mole)
Created: 2013-07-04 22:04:58.784
Updated: 2013-07-04 22:04:58.784


Author: kbraak@gbif.org
Created: 2013-08-28 15:37:58.421
Updated: 2013-08-28 15:37:58.421
        
Update from [~jlegind@gbif.org]

Here is the occurrence search: http://uat.gbif.org/occurrence/search?TAXON_KEY=chionomys&COUNTRY=DE#
The archive is attached. Look at the EML file: of the four publishers, only two are credited. All four are represented in the citations file and in the /dataset folder.
    


Author: kbraak@gbif.org
Created: 2013-08-28 15:54:11.479
Updated: 2013-08-28 15:54:11.479
        
Here's my analysis of your latest example:

Two of the four datasets come from an IPT and therefore have an associated eml.xml metadata file. The other two come from DiGIR and BioCASE and have no associated EML.

In the downloaded Archive's EML, the 2 publishers credited as associated parties (with role CONTENT_PROVIDER) appear to be taken from the eml.xml creator element.

[~mdoering@gbif.org] we need to decide:

a) Should we take the primary administrative contacts from DiGIR/BioCASE/TAPIR and add them to the EML included in the downloaded Archive (with role ADMINISTRATIVE_POINT_OF_CONTACT)? This would solve the problem of those publishers not being represented in the EML file.
b) Is the eml.Creator best represented as CONTENT_PROVIDER, or as ORIGINATOR or OWNER, for example? Not a big deal, I suppose.

    


Author: mdoering@gbif.org
Created: 2013-08-28 17:31:48.042
Updated: 2013-08-28 17:31:48.042
        
I did not look at the code, but I would probably set up a priority order of contact types and pick the first contact in that order for every dataset. When the metasync does its job, it should not matter what kind of endpoint the dataset has; all contacts should be accessible via a simple get-dataset-by-primary-key call, no? What about this order:

ADMINISTRATIVE_POINT_OF_CONTACT
OWNER
CREATOR
ORIGINATOR
...
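
A rough sketch of the priority-based selection described above, not taken from the actual download code. It assumes each dataset arrives as a plain dict with a "title" and a list of "contacts", each contact carrying a "type" string; those field names and both function names are hypothetical. Only the four contact types listed above are ranked, since the rest of the order is left open.

```python
from __future__ import annotations

from typing import Optional

# Priority order as proposed in the comment; the list there ends in "...",
# so any further types are deliberately not guessed at here.
CONTACT_PRIORITY = [
    "ADMINISTRATIVE_POINT_OF_CONTACT",
    "OWNER",
    "CREATOR",
    "ORIGINATOR",
]


def pick_contact(contacts: list[dict]) -> Optional[dict]:
    """Return the first contact of the highest-ranked type present, or None."""
    for contact_type in CONTACT_PRIORITY:
        for contact in contacts:
            if contact.get("type") == contact_type:
                return contact
    return None


def associated_parties(datasets: list[dict]) -> list[dict]:
    """Pick one contact per dataset, regardless of the dataset's endpoint type."""
    parties = []
    for dataset in datasets:
        contact = pick_contact(dataset.get("contacts", []))
        if contact is not None:
            # Hypothetical output shape: the chosen contact plus its dataset title.
            parties.append({"dataset": dataset["title"], **contact})
    return parties
```

With this approach a DiGIR or BioCASE dataset that has only an administrative contact would still yield one associated party, while a dataset with no contacts of a ranked type would be skipped rather than credited incorrectly.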
    


Author: omeyn@gbif.org
Comment: Status update [~kbraak@gbif.org] or [~mdoering@gbif.org] ? This looks like it could cause trouble.
Created: 2013-09-26 21:46:49.451
Updated: 2013-09-26 21:46:49.451


Author: kbraak@gbif.org
Comment: I believe the solution Markus proposes should be implemented. I consider it critical, but not a blocker. 
Created: 2013-09-27 09:53:21.043
Updated: 2013-09-27 09:53:21.043