Issue 14241

Extend Occurrence (again) with properties needed by the portal

Reporter: kbraak
Assignee: omeyn
Type: Improvement
Summary: Extend Occurrence (again) with properties needed by the portal
Priority: Critical
Resolution: Fixed
Status: Closed
Created: 2013-10-16 14:26:26.433
Updated: 2014-09-17 10:42:40.918
Resolved: 2014-09-17 10:42:40.883
        
Description: Every week, portal users report fields missing from the occurrence detail page, such as fieldNumber and typeStatus, to name a couple.

The Occurrence class ( http://builds.gbif.org/view/Common/job/gbif-api/site/apidocs/org/gbif/api/model/occurrence/Occurrence.html ) currently has 67 properties, and 50 map directly to Darwin Core terms ( http://rs.tdwg.org/dwc/terms/ ). The simple Darwin Core has approximately 150 terms, so we're missing about 100 terms, assuming we want to index them all.

Even worse is that not all properties of the Occurrence class are actually persisted in HBase. Quite a few are placeholders only. Compare the actual list of HBase fields with the Occurrence properties:

  https://code.google.com/p/gbif-occurrencestore/source/browse/occurrence/trunk/occurrence-common/src/main/java/org/gbif/occurrencestore/common/model/constants/FieldName.java

  http://builds.gbif.org/view/Common/job/gbif-api/site/apidocs/org/gbif/api/model/occurrence/Occurrence.html
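The comparison can be sketched as a set difference. This is only an illustration: the class `FieldGap`, the method `unpersisted`, and the sample names are hypothetical, and the real lists would have to be taken from FieldName.java and the Occurrence class.

```java
import java.util.Set;
import java.util.TreeSet;

public class FieldGap {
  // Returns the model properties that have no matching persisted field, sorted by name.
  static Set<String> unpersisted(Set<String> modelProperties, Set<String> persistedFields) {
    Set<String> gap = new TreeSet<>(modelProperties);
    gap.removeAll(persistedFields);
    return gap;
  }

  public static void main(String[] args) {
    // Hypothetical sample names, not the real contents of Occurrence or FieldName.
    Set<String> model = Set.of("typeStatus", "fieldNumber", "depth", "coordinateAccuracy");
    Set<String> hbase = Set.of("depth");
    System.out.println(unpersisted(model, hbase));
  }
}
```

Running the same check against the full Darwin Core term list instead of the model properties would give the set of terms still to be indexed.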

We should prepare ourselves to start indexing these and add them to the Occurrence model, portal and persistence layer.

At the same time, we should also align our naming with Darwin Core. For example, Darwin Core uses coordinatePrecision, but we use coordinateAccuracy. This just adds confusion.
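One possible way to handle the renaming during a transition is a small alias table. The sketch below is an assumption, not the project's actual approach: the class `TermAliases` and the method `toDwc` are hypothetical, and the name pairs are taken from this issue's discussion rather than from a verified mapping.

```java
import java.util.Map;

public class TermAliases {
  // Assumed pairs taken from the issue discussion; not a verified mapping.
  static final Map<String, String> LEGACY_TO_DWC = Map.of(
      "coordinateAccuracy", "coordinatePrecision",
      "coordinateAccuracyInMeters", "coordinateUncertaintyInMeters");

  // Returns the Darwin Core name for a legacy property, or the name unchanged if no alias is known.
  static String toDwc(String property) {
    return LEGACY_TO_DWC.getOrDefault(property, property);
  }

  public static void main(String[] args) {
    System.out.println(toDwc("coordinateAccuracy"));
    System.out.println(toDwc("typeStatus"));
  }
}
```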
    


Author: kbraak@gbif.org
Comment: Linking PF-1248. Regarding coordinatePrecision/coordinateUncertaintyInMeters: our Occurrence model class supposedly holds this information as coordinateAccuracy/coordinateAccuracyInMeters, but from what I can see we are not storing it in HBase.
Created: 2013-10-16 14:43:04.374
Updated: 2013-10-16 14:43:04.374


Author: mdoering@gbif.org
Comment: For min/max DwC terms like depth, we also seem to store only a single value, not a range.
Created: 2013-10-16 14:59:32.586
Updated: 2013-10-16 14:59:32.586


Author: omeyn@gbif.org
Comment: There are many jiras asking for essentially the same thing - extend occurrence to index all of dwc. That's been the plan for the last year and continues to be the plan - priority one after crawling goes live. For coord and alt/depth precision, we only store one value because that's what the original rollover did. That's reasonably easy to fix, but part of the big occurrence refactor.
Created: 2013-10-17 09:27:04.881
Updated: 2013-10-17 09:27:04.881


Author: ahahn@gbif.org
Comment: The linked absence data issue (PF-1195) is a bit different from the rest, as publishers currently do not have any standard to follow for transmitting the information. Linking it here not to lose the aspect, but will require handling on a different level, if we decide to go for it.
Created: 2013-10-22 11:56:55.159
Updated: 2013-10-22 12:00:27.825


Author: omeyn@gbif.org
Comment: The system can now persist and display all of DwC, and DwC-A have had all fields indexed since spring 2014.
Created: 2014-09-17 10:42:40.916
Updated: 2014-09-17 10:42:40.916