Issue 13489

Double indexing?

Reporter: feedback bot
Assignee: omeyn
Type: Bug
Summary: Double indexing?
Priority: Blocker
Resolution: Fixed
Status: Closed
Created: 2013-07-09 11:09:36.663
Updated: 2013-08-23 12:37:58.877
Resolved: 2013-08-21 14:56:55.098
Description: Hi,
First, the beta version of the new portal looks great. Congratulations on that.
Of course, this is the report-a-bug feature... :)
I notice that the number of records shown for Florabank1 is roughly double the number of records that should be there. It gives 6,889,632 records, while the number of provided records is 3,474,204 (see screenshot).

Kind regards,

*Reporter*: Dimitri Brosens
*E-mail*: []

Attachment betaportalGBIF_Florabank1.PNG

Comment: [] Can you please check the live portal for this issue? If it is really wrong in the new portal only, please reassign to []
Created: 2013-08-19 12:25:49.462
Updated: 2013-08-19 12:25:49.462

Comment: The live indexing database record count matches what we expect. The issue is now assigned to [] 
Created: 2013-08-20 10:21:11.7
Updated: 2013-08-20 10:21:11.7

Comment: I checked the Solr index and the Hive table, and both report 6,889,632 records for dataset 271c444f-f8d8-4986-b748-e7367755c0c1 (Florabank1 - A grid-based database on vascular plant distribution in the northern part of Belgium (Flanders and the Brussels Capital region)). Apparently we are doubling the number of records somewhere; the data portal reports 3,474,135 records.
Created: 2013-08-20 14:45:42.214
Updated: 2013-08-20 14:45:42.214
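
Note: the discrepancy quoted in the comment above can be checked with a few lines of arithmetic. This is only a minimal sketch; the function name `compare_counts` and its parameters are hypothetical and are not part of the portal code base. It uses only the two counts quoted above.

```python
def compare_counts(indexed: int, expected: int) -> dict:
    """Return the surplus and the ratio of indexed records versus expected records."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return {
        "surplus": indexed - expected,   # how many extra records are indexed
        "ratio": indexed / expected,     # close to 2.0 suggests a full second copy
    }

# Counts quoted in the comment above: Solr/Hive versus the data portal.
result = compare_counts(indexed=6_889_632, expected=3_474_135)
print(f"surplus={result['surplus']:,} ratio={result['ratio']:.2f}")
```

A ratio close to 2 is consistent with most of the dataset being present twice, rather than with a scattering of unrelated duplicates.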

Comment: The HBase table has two copies with different collection codes, so this looks like a timing issue during the copy: a delete was probably scheduled but didn't execute before the copy was made from MySQL. I'll delete the copy with the old collection code from dev/staging.
Created: 2013-08-20 15:34:19.962
Updated: 2013-08-20 15:34:19.962
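
Note: one way to confirm this kind of duplication is to group records by a stable identifier and list those that appear under more than one collection code. The sketch below works on plain in-memory dictionaries; the field names `catalog_number` and `collection_code`, the sample values, and the function `find_duplicated_keys` are assumptions for illustration and do not reflect the actual HBase schema or client API.

```python
from collections import defaultdict

def find_duplicated_keys(records):
    """Return identifiers that occur under more than one collection code."""
    codes_by_key = defaultdict(set)
    for rec in records:
        # Hypothetical field names; adjust to the real occurrence schema.
        codes_by_key[rec["catalog_number"]].add(rec["collection_code"])
    return {key: sorted(codes) for key, codes in codes_by_key.items() if len(codes) > 1}

# Made-up sample rows: A1 exists under both an old and a new collection code.
sample = [
    {"catalog_number": "A1", "collection_code": "OLD"},
    {"catalog_number": "A1", "collection_code": "NEW"},
    {"catalog_number": "B2", "collection_code": "NEW"},
]
duplicates = find_duplicated_keys(sample)
print(duplicates)
```

Records listed here under the old collection code would be the candidates for deletion.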

Comment: Deleted from dev_occurrence; this will propagate to UAT with the next data copy (0.5 release).
Created: 2013-08-21 14:56:55.122
Updated: 2013-08-21 14:56:55.122