Issue 17789
Add crawlId to the occurrence SOLR schema
Reporter: trobertson
Type: Improvement
Summary: Add crawlId to the occurrence SOLR schema
Priority: Major
Resolution: Fixed
Status: Closed
Created: 2015-09-04 13:52:03.053
Updated: 2018-05-31 16:41:11.504
Resolved: 2018-05-31 16:41:11.479
Description: With faceting forthcoming, being able to facet counts on crawlId will be invaluable for determining which records to delete.
We regularly run SQL like this to determine which records to remove, which is highly inefficient:
{code}
select crawlId, count(*)
from prod_b.occurrence_hdfs
where datasetKey='ea95fd9b-58dc-4e48-b51f-9380e9804607'
group by crawlId;
select gbifId
from prod_b.occurrence_hdfs
where datasetKey='...' and crawlID NOT ....
{code}
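For comparison, the per-crawl count in the first query could be expressed as a single SOLR facet request once crawlId is indexed. The following is only a minimal sketch: the field names crawl_id and dataset_key and both helper functions are hypothetical, since the final schema names are not given in this issue. It uses only standard SOLR faceting parameters (facet, facet.field, facet.mincount) and does not decide which crawls to delete.

```python
from urllib.parse import urlencode


def crawl_facet_query(dataset_key, crawl_field="crawl_id", dataset_field="dataset_key"):
    """Build a SOLR query string that counts records per crawl for one dataset.

    The field names are assumptions for illustration, not the actual schema.
    """
    params = [
        ("q", "*:*"),
        ("fq", f'{dataset_field}:"{dataset_key}"'),  # restrict to one dataset
        ("rows", "0"),                               # only facet counts, no documents
        ("facet", "true"),
        ("facet.field", crawl_field),                # one bucket per crawl id
        ("facet.mincount", "1"),                     # skip crawls with no records
        ("wt", "json"),
    ]
    return urlencode(params)


def parse_facet_counts(flat):
    """Convert SOLR's flat [value, count, value, count, ...] list into a dict."""
    return {flat[i]: flat[i + 1] for i in range(0, len(flat), 2)}


if __name__ == "__main__":
    print(crawl_facet_query("ea95fd9b-58dc-4e48-b51f-9380e9804607"))
```

The query string would be appended to the select handler of the occurrence index, and the facet_fields entry for the crawl field in the JSON response can be passed to parse_facet_counts to get a count per crawl.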