Issue 18242

Registry Node title not taken from Directory API

18242
Reporter: cgendreau
Assignee: cgendreau
Type: Improvement
Summary: Registry Node title not taken from Directory API
Priority: Unassessed
Resolution: Fixed
Status: Closed
Created: 2016-02-18 14:59:37.136
Updated: 2016-06-03 16:53:27.405
Resolved: 2016-06-03 16:53:27.357
        
Description: The "Node" returned by the Registry doesn't include the "title" from the Directory API but its own copy from its database. That copy has to be maintained manually in the Registry or it becomes out of sync with the Directory.

Unless there is a valid reason (which should be documented here), we should probably always take the Node title from the Directory API.

Note:
The Registry is taking the data from the Directory API with the DirectoryAugmenterImpl class that uses the directory-ws-client.
    


Author: mdoering@gbif.org
Created: 2016-02-18 15:03:30.891
Updated: 2016-02-18 15:03:30.891
        
This is also true for other fields, most importantly the status.
I strongly second having a single source of truth, but if we choose to ignore the registry values and not maintain them, we might as well remove them, even though keeping them was deemed important for ad hoc queries and reporting.

So unless we are prepared to merge the 2 projects - or at least databases - we should think about a way to keep them both in sync without manual work. An hourly cron job for example.
    


Author: cgendreau
Comment: In the meantime, I think the idea was to use a foreign-data wrapper (http://www.postgresql.org/docs/9.3/static/postgres-fdw.html) to allow queries and reporting.
Created: 2016-02-19 13:28:46.198
Updated: 2016-02-19 13:28:46.198


Author: kbraak@gbif.org
Created: 2016-02-19 17:23:15.969
Updated: 2016-02-19 17:23:15.969
        
Thanks for submitting this issue Christian.

I'm cc'ing [~anmnielsen] to ask if she knows any reason why our Registry doesn't sync the Node title from the Directory/IMS. For example, this inconsistency meant that the Directory/IMS showed the correct title "Belgian Biodiversity Platform", whereas our Registry was out-of-date with the old name "Belgian Biodiversity Information Facility".

And then there is the problem of updating the publisher entries in our Registry. For example, BeBIF Provider is equal to the Belgian Biodiversity Platform. Sadly, it was also out of date with the wrong contact information for the Node manager. I had to manually go in and change this yesterday.

I learned that once a year, Marlene & [~anmnielsen] contact Nodes to get their updated information. Unfortunately, while that information is updated in the Directory/IMS for Nodes/Participants, the updates don't trickle down into our Registry.

An improvement would be to ask Nodes to also review their list of registered publishers, and report updates just like they do for Nodes/Participant entries.

You don't have to dig far before discovering some serious inconsistencies between the Directory/IMS and our Registry. An improved workflow is seriously needed to keep the information up to date. Thanks.
    


Author: anmnielsen
Created: 2016-02-23 16:28:05.997
Updated: 2016-02-23 16:28:05.997
        
Hi [~kbraak@gbif.org] - I have no idea why the Registry does not sync the Node title from the Directory. It is very unfortunate, and I understand that [~cgendreau] is looking into possible ways to avoid the duplicate systems. If this is not possible, we should see if we can get some sort of alert system so that it is easy to see what has been changed/added to the Directory.
For the time being, I will ask the relevant Admin people to report to you any changes to the Node information in the Registry.
I will also ask Marlene to add a request for the nodes to review the endorsed data publisher list when she contacts the nodes.
    


Author: cgendreau
Comment: Let me check with [~jlegind@gbif.org]; we have other solutions.
Created: 2016-02-23 17:39:01.412
Updated: 2016-02-23 17:39:01.412


Author: cgendreau
Created: 2016-04-15 15:56:45.82
Updated: 2016-04-15 15:56:45.82
        
Since registry 2.50, the registry-cli has a new CLI target called "directory-update" for that purpose.

Each night, changes made to the Directory Node and Participant are applied to the Node in the registry. Only modifications and creations of entities are applied. No entities will be deleted (or flagged as deleted) by this process.
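
To illustrate the rule described above (apply creations and modifications, never deletions), here is a minimal sketch in Python. It is not the registry-cli implementation: the function name, the use of plain dictionaries in place of the Directory and Registry stores, and the field names are all assumptions made for the example.

```python
def apply_directory_changes(registry_nodes, directory_nodes):
    """Copy Directory records into the Registry copy without deleting anything.

    Both arguments are hypothetical stand-ins: dicts mapping a node key to a
    dict of fields. Returns the keys that were created and the keys that were
    updated, in the order they were processed.
    """
    created, updated = [], []
    for key, directory_node in directory_nodes.items():
        current = registry_nodes.get(key)
        if current is None:
            # New in the Directory: create it in the Registry.
            registry_nodes[key] = dict(directory_node)
            created.append(key)
        elif any(current.get(field) != value
                 for field, value in directory_node.items()):
            # Changed in the Directory: overwrite the differing fields.
            current.update(directory_node)
            updated.append(key)
    # Keys present only in the Registry are deliberately left untouched.
    return created, updated
```

A Registry entry that no longer exists in the Directory simply stays as it is, which matches the "no deletion" behaviour stated above.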

This process will also remove acronyms from the titles in the registry. This can have an impact on the search. The acronyms should be available in the 'node' database of the registry.
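
As a rough illustration of the acronym removal, the sketch below splits a title into a clean title and an acronym so the acronym can be stored separately. The thread does not say how acronyms appear in titles; the sketch assumes a trailing acronym in parentheses, and the function name and pattern are hypothetical.

```python
import re

# Assumption: the acronym is the last thing in the title, wrapped in
# parentheses, starts with an uppercase letter and is 2 to 10 characters long.
_TRAILING_ACRONYM = re.compile(r"\s*\(([A-Z][A-Za-z0-9\-]{1,9})\)\s*$")


def split_acronym(title):
    """Return (title_without_acronym, acronym_or_None)."""
    match = _TRAILING_ACRONYM.search(title)
    if match is None:
        return title.strip(), None
    return title[:match.start()].strip(), match.group(1)
```

Under that assumption, a made-up title such as "Example Node (EXN)" would be stored as "Example Node" with "EXN" kept as a separate value, while titles without a trailing acronym pass through unchanged.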