Add hosting country and owning country to the occurrence store
Summary: Add hosting country and owning country to the occurrence store
Created: 2012-10-11 11:25:31.808
Updated: 2013-12-17 15:16:56.632
Resolved: 2012-10-15 14:39:36.108
Description: GBIF are often asked to do national summary counts based on the country hosting the data and the country owning the data. To enable "group by" counts, I propose these be stored at the record level to allow, e.g.:
/* Count of records by the country responsible for hosting content */
SELECT host_country_iso, COUNT(*)
FROM occurrence /* table name assumed for illustration */
GROUP BY host_country_iso
One example of this use will be in the Cube, which will serve as a replacement for the existing country repatriation table: http://data.gbif.org/countries/datasharing
This information is captured within the registry, so just as the publisher can be determined from the dataset UUID, these fields should be derivable from it too.
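A minimal sketch of the proposal, in Python for brevity: each record is stamped with the country fields looked up from its dataset, after which the "group by" count needs no join against the registry. The registry snapshot, the dataset keys, and the field name owner_country_iso are assumptions made for illustration; only host_country_iso appears in the query above.

```python
from collections import Counter

# Hypothetical registry snapshot: dataset UUID -> country fields.
REGISTRY = {
    "ds-1": {"host_country_iso": "DK", "owner_country_iso": "GL"},
    "ds-2": {"host_country_iso": "DK", "owner_country_iso": "DK"},
    "ds-3": {"host_country_iso": "US", "owner_country_iso": "CR"},
}


def stamp_countries(record, registry):
    """Return a copy of the record with both country fields copied from its dataset."""
    dataset = registry[record["dataset_key"]]
    return {**record, **dataset}


def count_by(records, field):
    """Equivalent of SELECT field, COUNT(*) ... GROUP BY field."""
    return Counter(r[field] for r in records)


# Made-up occurrence records that carry only a dataset key.
occurrences = [
    {"id": 1, "dataset_key": "ds-1"},
    {"id": 2, "dataset_key": "ds-1"},
    {"id": 3, "dataset_key": "ds-2"},
    {"id": 4, "dataset_key": "ds-3"},
]

stamped = [stamp_countries(o, REGISTRY) for o in occurrences]
host_counts = count_by(stamped, "host_country_iso")
owner_counts = count_by(stamped, "owner_country_iso")
```

The trade-off is that the stamped values are copies, so a change in the registry would have to be propagated to every affected record.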
Created: 2012-10-11 12:41:05.836
Updated: 2012-10-11 12:41:05.836
Do you propose to add these fields to the Occurrence class (not only HBase) for the cube?
I'm slightly uneasy about adding any dataset properties to our data/record objects, as we would need to push any changes in the registry down to the occurrence store, flush the occurrence Varnish caches, etc. An alternative might be to also pass a Dataset object into the cube for incrementing its counters, or for the cube to maintain a small dataset cache and look up the needed dataset object itself.
In any case, a change in the registry might have a significant impact on the cube, so the cube would need to be recalculated. Nasty.
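The cache alternative mentioned above could look roughly like the following sketch, in Python for brevity. The class name, the fetch callback, the five-minute expiry, and the record layout are all hypothetical and do not describe the actual cube or registry API.

```python
import time


class DatasetCache:
    """Small in-memory cache of dataset objects with a time-based expiry."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch        # callable: dataset key -> dataset dict
        self._ttl = ttl_seconds    # expiry is an assumed value
        self._clock = clock
        self._entries = {}         # dataset key -> (time fetched, dataset)

    def get(self, dataset_key):
        now = self._clock()
        hit = self._entries.get(dataset_key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        dataset = self._fetch(dataset_key)
        self._entries[dataset_key] = (now, dataset)
        return dataset


def add_to_cube(counters, record, cache):
    """Increment the host-country counter using the cached dataset, not the record."""
    dataset = cache.get(record["dataset_key"])
    country = dataset["host_country_iso"]
    counters[country] = counters.get(country, 0) + 1


# Stand-in for a registry lookup; records each call so cache hits are visible.
calls = []


def fake_registry_fetch(dataset_key):
    calls.append(dataset_key)
    return {"host_country_iso": "DK"}


cache = DatasetCache(fake_registry_fetch)
counters = {}
for rec in [{"dataset_key": "ds-1"}, {"dataset_key": "ds-1"}]:
    add_to_cube(counters, rec, cache)
```

This keeps dataset properties out of the record objects, but it does not remove the recalculation concern: counts already added under an old country value stay as they are when the registry changes.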