Issue 11841

Streamline how we handle identifiers in various places in the API

Reporter: mdoering
Type: Improvement
Summary: Streamline how we handle identifiers in various places in the API
Priority: Major
Resolution: WontFix
Status: Resolved
Created: 2012-09-12 17:16:02.78
Updated: 2014-05-20 16:41:21.006
Resolved: 2014-05-01 11:41:27.857
        
Description: Identifiers such as URLs, DOIs or LSIDs show up in various places in our API, and it takes some logic to create actionable links for them. Ideally this logic should live in only one place. Currently it lives mostly in the model.common.Identifier class, via the getIdentifierLink method.

Other classes that hold identifiers in our API are at least the following:

 model.common.Identifier
 model.checklistbank.Identifier
 model.registry.Identifier
 model.registry.citation.Citation
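
For illustration, the kind of link-building logic the description refers to could be centralized roughly as follows. This is a minimal sketch, not the actual getIdentifierLink implementation: the class name IdentifierLinks, the method toLink, the prefix rules and the LSID resolver base URL (a placeholder) are all assumptions made for the example.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper; NOT the real model.common.Identifier#getIdentifierLink. */
final class IdentifierLinks {

  // Assumed resolver base URLs; the LSID one is a placeholder, not a real service.
  private static final String DOI_RESOLVER = "https://doi.org/";
  private static final String LSID_RESOLVER = "http://lsid.example.org/";

  // Rough shape of a bare DOI such as 10.1234/abc (an assumption, not a full DOI grammar).
  private static final Pattern BARE_DOI = Pattern.compile("^10\\.\\d{4,9}/\\S+$");

  private IdentifierLinks() {}

  /** Returns an actionable link for the identifier, or null when none can be built. */
  static String toLink(String identifier) {
    if (identifier == null) {
      return null;
    }
    String id = identifier.trim();
    String lower = id.toLowerCase(Locale.ROOT);
    if (lower.startsWith("http://") || lower.startsWith("https://")) {
      return id; // already a link
    }
    if (lower.startsWith("doi:")) {
      return DOI_RESOLVER + id.substring("doi:".length());
    }
    if (BARE_DOI.matcher(id).matches()) {
      return DOI_RESOLVER + id;
    }
    if (lower.startsWith("urn:lsid:")) {
      return LSID_RESOLVER + id;
    }
    return null; // unknown scheme: no link
  }
}
```

Each of the classes listed above could then delegate to such a helper instead of carrying its own copy of the rules.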


Author: kbraak@gbif.org
Created: 2012-11-15 11:41:55.725
Updated: 2012-11-15 11:42:26.586
        
I took a stab at it but stopped because it didn't appear to be simplifying things, only making them more complicated.

What I tried to do was combine the common fields into a single Identifier class. The Registry version collapsed into the common one, by just adding an id field to it. ChecklistBank's Identifier could just extend the common Identifier, having one additional field called usageKey.

Problems with simplification arose mostly because ChecklistBank's Identifier extends NameUsageComponent. That means it needs getters and setters for fields called key, usageKey and datasetKey, where key corresponds to id and datasetKey corresponds to networkEntityKey. I made NameUsageComponent an interface and converted id <-> key and datasetKey <-> networkEntityKey in the getters and setters, but this didn't feel very clean. Cleanest would be to rename these fields in NameUsageComponent and cascade the changes throughout ChecklistBank.
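
A rough sketch of the interface-based mapping described above, to show why it feels awkward. The field types, the accessor names and the subclass name ChecklistBankIdentifier are assumptions for illustration; only the field names key, usageKey, datasetKey, id and networkEntityKey come from the comment.

```java
import java.util.UUID;

/** NameUsageComponent as an interface; the accessor set is assumed from the field names. */
interface NameUsageComponent {
  Integer getKey();
  void setKey(Integer key);
  Integer getUsageKey();
  void setUsageKey(Integer usageKey);
  UUID getDatasetKey();
  void setDatasetKey(UUID datasetKey);
}

/** Combined common Identifier (hypothetical shape). */
class Identifier {
  private Integer id;
  private String identifier;
  private UUID networkEntityKey;

  public Integer getId() { return id; }
  public void setId(Integer id) { this.id = id; }
  public String getIdentifier() { return identifier; }
  public void setIdentifier(String identifier) { this.identifier = identifier; }
  public UUID getNetworkEntityKey() { return networkEntityKey; }
  public void setNetworkEntityKey(UUID networkEntityKey) { this.networkEntityKey = networkEntityKey; }
}

/** ChecklistBank variant: adds usageKey and aliases key/datasetKey onto the common fields. */
class ChecklistBankIdentifier extends Identifier implements NameUsageComponent {
  private Integer usageKey;

  @Override public Integer getKey() { return getId(); }                      // key <-> id
  @Override public void setKey(Integer key) { setId(key); }
  @Override public Integer getUsageKey() { return usageKey; }
  @Override public void setUsageKey(Integer usageKey) { this.usageKey = usageKey; }
  @Override public UUID getDatasetKey() { return getNetworkEntityKey(); }    // datasetKey <-> networkEntityKey
  @Override public void setDatasetKey(UUID datasetKey) { setNetworkEntityKey(datasetKey); }
}
```

Every aliased pair exposes the same value under two names, which is the part that did not feel clean.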


Author: kbraak@gbif.org
Created: 2012-11-15 11:42:51.003
Updated: 2012-11-15 11:42:51.003
        
Captured from Tim's comment over Skype:

Honestly I think we would be better off sticking to 3 objects and doing a proper refactor later.

public class Identifier {
  private int id; // id in an Identifier?
  private String identifier;  // identifier in an Identifier?
  private String title; // eh?
  private UUID datasetKey;  // why?
}

Basically it just looks plain old wrong in my eyes.  Something more along the lines of the following is what I would hope we can design together:

/**
 * An identifier for an entity, which may represent a globally unique identifier but cannot be guaranteed.
 * @param <TARGET> The key type for the target entity to which this identifier provides a secondary version.
 * @param <TYPE> Of the actual identifier.
 */
public class SecondaryIdentifier<TARGET, TYPE> {
  private TARGET target;  // the entity being identified with a secondary identifier
  private TYPE value; // the actual secondary identifier (TYPE could be UrlIdentifier, LSID, String, Integer etc)
}

and to demonstrate a type of identifier:

public class UrlIdentifier {
  private URL url;
  private Optional<String> label; // when present would be used in user interfaces, for example
}
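
To make the proposal concrete, here is a minimal self-contained sketch of how the two classes might fit together. It assumes TARGET and TYPE are generic type parameters, that the label is a String, and that java.util.Optional is used (the original may have meant a different Optional type). The constructors, the factory method of and displayText are hypothetical additions for the example.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Optional;

/** Secondary identifier pointing at a target entity; type parameters are assumed. */
final class SecondaryIdentifier<TARGET, TYPE> {
  private final TARGET target; // the entity being identified
  private final TYPE value;    // the actual secondary identifier

  SecondaryIdentifier(TARGET target, TYPE value) {
    this.target = target;
    this.value = value;
  }

  TARGET getTarget() { return target; }
  TYPE getValue() { return value; }
}

/** One possible identifier type: a URL with an optional display label. */
final class UrlIdentifier {
  private final URL url;
  private final Optional<String> label;

  private UrlIdentifier(URL url, Optional<String> label) {
    this.url = url;
    this.label = label;
  }

  /** Hypothetical factory; label may be null when there is none. */
  static UrlIdentifier of(String url, String label) {
    try {
      return new UrlIdentifier(URI.create(url).toURL(), Optional.ofNullable(label));
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Not a valid URL: " + url, e);
    }
  }

  URL getUrl() { return url; }

  /** Text for user interfaces: the label when present, otherwise the URL itself. */
  String displayText() { return label.orElse(url.toString()); }
}
```

A dataset keyed by UUID could then carry a `SecondaryIdentifier<UUID, UrlIdentifier>`, while a name usage keyed by Integer could carry a `SecondaryIdentifier<Integer, String>`, without either needing its own Identifier class.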


Author: omeyn@gbif.org
Comment: Based on Kyle's experience this can be reopened if it ever reaches high enough priority.
Created: 2014-05-01 11:41:27.885
Updated: 2014-05-01 11:41:27.885