Issue 11270

Add new DatasetMetrics class

11270
Reporter: mdoering
Type: Improvement
Summary: Add new DatasetMetrics class
Priority: Major
Resolution: Fixed
Status: Closed
Created: 2012-05-31 14:27:17.388
Updated: 2013-12-16 17:50:33.899
Resolved: 2012-08-09 14:05:19.301
        
Description: Instead of adding more metrics properties directly to the Dataset class, a new DatasetMetrics class is proposed to hold all metrics about a specific indexed dataset at a given point in time.
This makes it possible to keep the history and to derive graphs and further statistics.

The class consists of a number of int properties plus timestamps for:

 dateRetrieved (time of the dwca download or last completed harvesting)
 dateIndexed (time the data got indexed and interpreted, could be several times on the basis of the same source data)
 recordsIndexed
 recordsUnreadable (sometimes bad records break transfer protocols, so are deemed unreadable)

 -- breakdown by Kingdom
  recordsAnimalia
  recordsPlantae
  ...

 -- breakdown by Rank
  numRankKingdom
  ...
  numRankGenus
  numRankSpecies
  numRankInfraspecific

 -- breakdown by BasisOfRecord
  recordsObservation
  recordsSpecimen
  ...
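
The proposed class could be sketched roughly as follows. This is only an illustration, not the code that was committed: the property names come from the list above, while the choice of java.time.Instant for the timestamps, the immutable constructor style and the indexedDelta helper are assumptions. Only the two kingdoms named above are included; the elided properties and the rank and BasisOfRecord breakdowns would follow the same pattern.

```java
import java.time.Instant;

/** Hypothetical sketch of one metrics snapshot for an indexed dataset. */
final class DatasetMetrics {
    private final Instant dateRetrieved;  // dwca download or last completed harvesting
    private final Instant dateIndexed;    // when the data got indexed and interpreted
    private final int recordsIndexed;
    private final int recordsUnreadable;  // records that broke the transfer protocol
    private final int recordsAnimalia;    // breakdown by Kingdom (others omitted here)
    private final int recordsPlantae;

    DatasetMetrics(Instant dateRetrieved, Instant dateIndexed, int recordsIndexed,
                   int recordsUnreadable, int recordsAnimalia, int recordsPlantae) {
        this.dateRetrieved = dateRetrieved;
        this.dateIndexed = dateIndexed;
        this.recordsIndexed = recordsIndexed;
        this.recordsUnreadable = recordsUnreadable;
        this.recordsAnimalia = recordsAnimalia;
        this.recordsPlantae = recordsPlantae;
    }

    Instant getDateRetrieved() { return dateRetrieved; }
    Instant getDateIndexed() { return dateIndexed; }
    int getRecordsIndexed() { return recordsIndexed; }
    int getRecordsUnreadable() { return recordsUnreadable; }
    int getRecordsAnimalia() { return recordsAnimalia; }
    int getRecordsPlantae() { return recordsPlantae; }

    /** Assumed helper: change in indexed records between two snapshots of the same dataset. */
    static int indexedDelta(DatasetMetrics older, DatasetMetrics newer) {
        return newer.getRecordsIndexed() - older.getRecordsIndexed();
    }
}
```

Because every snapshot carries its own dateIndexed, two instances with the same dateRetrieved but different dateIndexed would represent the same source data being processed twice, which is what allows the history to be kept.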
    


Author: kbraak@gbif.org
Comment: Can I just confirm what is meant by indexed? Does this represent writing to raw_occurrence_record for example, or is it when the raw_occurrence_records have been interpreted into occurrence_records? From your description above, it sounds like the latter. 
Created: 2012-05-31 14:34:17.108
Updated: 2012-05-31 14:34:17.108


Author: mdoering@gbif.org
Created: 2012-05-31 14:42:31.882
Updated: 2012-05-31 14:42:31.882
        
I'd say interpretation is the key, as this might change based on the same raw sources. So the latter, yes.
For Checklist Bank this happens at the same time, as there is no raw records table (yet).

Maybe rename it to dateInterpreted or even better dateProcessed instead?
    


Author: kbraak@gbif.org
Comment: I like dateProcessed because it removes that ambiguity. At least it's clear to me what we mean, so either way is fine for me.
Created: 2012-05-31 15:00:51.469
Updated: 2012-05-31 15:00:51.469


Author: mdoering@gbif.org
Comment: Added to the API, but not linked to Dataset yet, and no persistence implemented.
Created: 2012-05-31 16:44:42.61
Updated: 2012-05-31 16:44:42.61


Author: mdoering@gbif.org
Comment: Renamed to NetworkEntityMetrics and added a get method for all entities in the service. See http://code.google.com/p/gbif-registry/source/detail?r=3185
Created: 2012-08-09 14:05:19.335
Updated: 2012-08-09 14:05:19.335