Issue 11650

Extend Occurrence with properties needed by the portal

Reporter: mdoering
Assignee: mdoering
Type: NewFeature
Summary: Extend Occurrence with properties needed by the portal
Priority: Critical
Resolution: Fixed
Status: Closed
Created: 2012-08-03 11:49:23.85
Updated: 2013-12-17 15:16:57.34
Resolved: 2012-08-21 14:45:33.737
        
Description: The occurrence detail page on the portal requires more data than the current Occurrence model class contains.
http://staging.gbif.org:8080/portal-web-dynamic/occurrence/1000000

The following properties need to be added in descending order of priority:
- POSITION: string for locality given as dwc:locality
- CONTINENT: string
- STATE/PROVINCE: string
- COUNTY: string
- TYPE STATUS: enum missing from common-api, see http://rs.gbif.org/vocabulary/gbif/type_status.xml
- COLLECTOR NAME: String, see http://rs.tdwg.org/dwc/terms/index.htm#recordedBy
- COLLECTOR NOTES: String, see http://rs.tdwg.org/dwc/terms/index.htm#fieldNotes & http://rs.tdwg.org/dwc/terms/index.htm#eventRemarks
- USAGE RIGHTS: String, see http://rs.tdwg.org/dwc/terms/index.htm#dcterms:rights
- HOW TO CITE IT: String, if existing http://rs.tdwg.org/dwc/terms/index.htm#dcterms:bibliographicCitation, otherwise our own citation string
- INDIVIDUAL COUNT: int, see http://rs.tdwg.org/dwc/terms/index.htm#individualCount
- GEOPRECISION: int in meters, see http://rs.tdwg.org/dwc/terms/index.htm#coordinateUncertaintyInMeters and http://rs.tdwg.org/dwc/terms/index.htm#coordinatePrecision
- IDENTIFIER NAME: String, see http://rs.tdwg.org/dwc/terms/index.htm#identifiedBy
- IDENTIFICATION DATE: date, see http://rs.tdwg.org/dwc/terms/index.htm#dateIdentified
- OCCURRENCE NOTES: String, http://rs.tdwg.org/dwc/terms/index.htm#occurrenceRemarks & http://rs.tdwg.org/dwc/terms/index.htm#fieldNotes
- IDENTIFICATION NOTES: String, http://rs.tdwg.org/dwc/terms/index.htm#identificationRemarks
- IDENTIFICATION QUALIFIER: String, http://rs.tdwg.org/dwc/terms/index.htm#identificationQualifier
- IDENTIFICATION REFERENCES: String, http://rs.tdwg.org/dwc/terms/index.htm#identificationReferences
- HABITAT: String, http://rs.tdwg.org/dwc/terms/index.htm#habitat
- ESTABLISHMENT MEANS: String, http://rs.tdwg.org/dwc/terms/index.htm#establishmentMeans
- REPRODUCTIVE CONDITION: String, http://rs.tdwg.org/dwc/terms/index.htm#reproductiveCondition
- BEHAVIOR: String, http://rs.tdwg.org/dwc/terms/index.htm#behavior
- PREPARATIONS: String, http://rs.tdwg.org/dwc/terms/index.htm#preparations
- DISPOSITION: String, http://rs.tdwg.org/dwc/terms/index.htm#disposition
- LIFE STAGE: enum, see http://staging.gbif.org:8080/enum-web/enum?id=org.gbif.checklistbank.api.model.vocabulary.LifeStage
- SEX: enum, see http://staging.gbif.org:8080/enum-web/enum?id=org.gbif.checklistbank.api.model.vocabulary.Sex

Many of these properties were not indexed in the old MySQL data portal, so we don't have any data available yet.
Nevertheless, we should prepare to start indexing them and add them to the Occurrence model, portal and persistence layer.
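
As a rough sketch only, the higher-priority properties from the list above could sit on the model as nullable fields, since most of them have no data yet. The class name OccurrenceSketch, the field names, the reduced TypeStatus enum and the citation() helper are all assumptions made for illustration, not the actual Occurrence API:

```java
import java.util.Date;

// Hypothetical stand-in for the enum missing from common-api;
// the real type status vocabulary has more values than these.
enum TypeStatus { HOLOTYPE, PARATYPE, SYNTYPE }

// Illustrative only: not the real Occurrence model class.
// Getters and setters are omitted for brevity.
class OccurrenceSketch {
  // Reference types throughout, so "not indexed yet" can be represented as null.
  String locality;                        // dwc:locality
  String continent;
  String stateProvince;
  String county;
  TypeStatus typeStatus;
  String collectorName;                   // dwc:recordedBy
  String rights;                          // dcterms:rights
  String bibliographicCitation;           // dcterms:bibliographicCitation
  Integer individualCount;                // dwc:individualCount
  Integer coordinateUncertaintyInMeters;  // dwc:coordinateUncertaintyInMeters
  String identifierName;                  // dwc:identifiedBy
  Date identificationDate;                // dwc:dateIdentified

  // "How to cite it": prefer the supplied citation if one exists,
  // otherwise fall back to a citation string generated elsewhere.
  String citation(String generatedCitation) {
    if (bibliographicCitation != null && !bibliographicCitation.trim().isEmpty()) {
      return bibliographicCitation;
    }
    return generatedCitation;
  }
}
```

The remaining String properties in the list would follow the same pattern.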
    


Author: mdoering@gbif.org
Comment: Identifications should become multi-valued at some stage, in case that needs to be taken into account already.
Created: 2012-08-03 11:50:58.544
Updated: 2012-08-03 11:50:58.544


Author: mdoering@gbif.org
Created: 2012-08-09 11:16:09.128
Updated: 2012-08-09 11:16:09.128
        
collector_name, locality, type_status, identifier_name and identification_date already exist in the MySQL occurrence tables (both raw & interpreted) and should be rather quick additions.

    


Author: trobertson@gbif.org
Created: 2012-08-09 13:07:56.451
Updated: 2012-08-09 13:07:56.451
        
Typification record handling in the portal has been terrible (my fault).

Please *don't* simply copy the existing implementation, but rather rethink what we are trying to do (query for records that were designated types) and redesign both the interpretation and storage. Markus might be the best person to do this, as it requires a little domain knowledge about the vocabularies people use, and also about the scenarios that result in a type specimen being known by different names over time. Please bear in mind that the record might have been designated a typification record under a name different from the current scientific name.

I could imagine one sensible solution would be a multi-value field that holds each typestatus + scientificName combination that the specimen has been used for. This could be stored, for example, as an Avro-serialized field, but will we be able to query for that in downloads?
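
To make the multi-value idea concrete, here is a minimal sketch of how it might look in the model. The class names TypeDesignation and TypedOccurrence, the plain String type status and the query helpers are assumptions for illustration only; it does not address Avro storage or querying in downloads:

```java
import java.util.ArrayList;
import java.util.List;

// One typestatus + scientificName pair (hypothetical class).
final class TypeDesignation {
  final String typeStatus;      // free text here; an enum in practice
  final String scientificName;  // the name the specimen was designated a type for

  TypeDesignation(String typeStatus, String scientificName) {
    this.typeStatus = typeStatus;
    this.scientificName = scientificName;
  }
}

// Hypothetical occurrence carrying a multi-value type designation field.
final class TypedOccurrence {
  final String scientificName;  // current identification of the record
  final List<TypeDesignation> typeDesignations = new ArrayList<>();

  TypedOccurrence(String scientificName) {
    this.scientificName = scientificName;
  }

  // True if the record was designated a type under any name.
  boolean isType() {
    return !typeDesignations.isEmpty();
  }

  // True if the record was designated a type for the given name,
  // which may differ from the current scientific name.
  boolean isTypeOf(String name) {
    for (TypeDesignation d : typeDesignations) {
      if (d.scientificName.equals(name)) {
        return true;
      }
    }
    return false;
  }
}
```

With placeholder names, a record currently identified as "Xus bus" that holds a designation ("holotype", "Aus bus") answers isType() with true and isTypeOf("Aus bus") with true, while isTypeOf("Xus bus") is false.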
    


Author: mdoering@gbif.org
Created: 2012-08-09 13:19:50.068
Updated: 2012-08-09 13:19:50.068
        
Right, Tim. I will try to prepare the Occurrence model class for types and refresh my memory of how ABCD, XML DwC and dwca keep that information.
It will probably be close to http://rs.gbif.org/extension/gbif/1.0/typesandspecimen.xml but a bit simpler.
Also see the BGBM CDM typification model: http://wp5.e-taxonomy.eu/cdm-uml/latest/index.htm?goto=7:211 (gotta resize the left bar to fix the js bug)
For occurrences we only ever have a SpecimenTypeDesignation (NameTypeDesignations are for higher taxa and don't involve specimens, for example a genus name is typed by some species name).
    


Author: mdoering@gbif.org
Comment: We also need to add the missing spatial String attributes for Continent, State/Province and County
Created: 2012-08-20 16:17:44.025
Updated: 2012-08-20 16:17:44.025


Author: mdoering@gbif.org
Created: 2012-08-21 14:45:33.761
Updated: 2012-08-21 14:45:33.761
        
see http://code.google.com/p/gbif-occurrencestore/source/detail?r=1580
and http://code.google.com/p/gbif-occurrencestore/source/detail?r=1582