Issue 15571

Archive reader sorting bug during clb indexing

Reporter: mdoering
Assignee: mdoering
Type: Bug
Summary: Archive reader sorting bug during clb indexing
Priority: Blocker
Resolution: CantReproduce
Status: Resolved
Created: 2014-04-29 14:32:22.569
Updated: 2014-05-15 13:55:58.689
Resolved: 2014-05-15 13:55:58.647
        
Description: A few Plazi archives are failing during indexing with an archive reader error when opening the files:

9024:ERROR [2014-04-29 11:23:28,323+0200] [main] org.gbif.dwc.text.Archive: Error sorting extension file /var/local/indexing/clbindexing/checklists/2359/occurrences.txt : Comparison method violates its general contract!
9076:ERROR [2014-04-29 11:23:29,083+0200] [main] org.gbif.dwc.text.Archive: Error sorting extension file /var/local/indexing/clbindexing/checklists/2360/occurrences.txt : Comparison method violates its general contract!
13316:ERROR [2014-04-29 11:23:30,318+0200] [main] org.gbif.dwc.text.Archive: Error sorting extension file /var/local/indexing/clbindexing/checklists/3237/occurrences.txt : Comparison method violates its general contract!
17361:ERROR [2014-04-29 11:28:43,587+0200] [main] org.gbif.dwc.text.Archive: Error sorting extension file /var/local/indexing/clbindexing/checklists/3443/occurrences.txt : Comparison method violates its general contract!
26636:ERROR [2014-04-29 11:50:03,639+0200] [main] org.gbif.dwc.text.Archive: Error sorting extension file /var/local/indexing/clbindexing/checklists/3592/occurrences.txt : Comparison method violates its general contract!


Need to investigate further what's going on.
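
For context: "Comparison method violates its general contract!" is the message of the IllegalArgumentException that the JDK sort (TimSort, Java 7+) throws when it detects an inconsistent Comparator. The comparator used by the archive reader is not shown in this report, so the following is only a sketch of how such a violation can be detected on sample keys. BROKEN, FIXED and firstViolation are hypothetical names for illustration, not part of the dwca reader.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ComparatorContractCheck {

    // Hypothetical broken comparator: any comparison involving an empty key
    // returns 1, so compare("", "x") and compare("x", "") are both positive.
    static final Comparator<String> BROKEN =
        (a, b) -> (a.isEmpty() || b.isEmpty()) ? 1 : a.compareTo(b);

    // Consistent alternative: empty keys sort first, the rest lexicographically.
    static final Comparator<String> FIXED = (a, b) -> {
        if (a.isEmpty() != b.isEmpty()) {
            return a.isEmpty() ? -1 : 1;
        }
        return a.compareTo(b);
    };

    // Brute-force check of antisymmetry and transitivity over a small sample.
    // Returns a description of the first violation found, or null if none.
    static <T> String firstViolation(List<T> items, Comparator<T> cmp) {
        for (T a : items) {
            for (T b : items) {
                int ab = Integer.signum(cmp.compare(a, b));
                int ba = Integer.signum(cmp.compare(b, a));
                if (ab != -ba) {
                    return "antisymmetry: [" + a + "] vs [" + b + "]";
                }
                for (T c : items) {
                    if (cmp.compare(a, b) > 0 && cmp.compare(b, c) > 0
                            && cmp.compare(a, c) <= 0) {
                        return "transitivity: [" + a + "], [" + b + "], [" + c + "]";
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("", "a", "b");
        System.out.println("BROKEN: " + firstViolation(keys, BROKEN));
        System.out.println("FIXED:  " + firstViolation(keys, FIXED));
    }
}
```

The check is cubic in the sample size, so it is only meant for a handful of suspicious keys (for example empty or duplicate core IDs), not for a whole extension file.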

Archives in question:
 http://plazi.cs.umb.edu/GgServer/dwca/02D6094C6E9F74C855AF30382F8F7B2D.zip  (2360)
 http://plazi.cs.umb.edu/GgServer/dwca/9228382C34481F0728E91B2858E96999.zip  (3237)
 http://plazi.cs.umb.edu/GgServer/dwca/2ABCDBCE441E554E596A29F1744A0A61.zip  (2359)
 http://plazi.cs.umb.edu/GgServer/dwca/E3B79C8FF6B0B87BE406EA4948E96A3B.zip  (3443)
 http://plazi.cs.umb.edu/GgServer/dwca/6632D8151686A3F8C71D4B5A5B1181A4.zip  (3592)


Author: mdoering@gbif.org
Comment: All archives work fine with the reader; there are no exceptions when iterating over star records. There are a few other issues, such as duplicate taxon keys and a bad eml.xml referred to from the meta.xml, which I reported to Plazi (Guido), but nothing fatal like this issue suggests. Closing as invalid.
Created: 2014-05-15 13:53:42.633
Updated: 2014-05-15 13:53:42.633