Issue 16425

dwca-reader misses some extension records

Reporter: mdoering
Assignee: mblissett
Type: Bug
Summary: dwca-reader misses some extension records
Priority: Blocker
Resolution: Fixed
Status: Closed
Created: 2014-09-24 17:57:53.074
Updated: 2018-03-20 17:45:20.702
Resolved: 2018-03-20 17:45:05.775
        
Description: The star iterator inside the dwca-reader does not return all extension records for some archives. Discovered in Pensoft archives whose occurrence extension has 740 records, of which only 661 get through.

The bug is caused by the data files being sorted differently. The core and extension files should follow the same sort order, based on the coreid.
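
To illustrate why a mismatched sort order drops records, here is a minimal sketch of a single-pass merge join between core ids and extension rows. It is not the dwca-reader code: the class `SortedJoinSketch`, its methods, and the sample ids are hypothetical, and the lexicographic-versus-numeric mismatch is only one assumed way the two files could end up ordered differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not taken from dwca-reader or gbif-common.
public class SortedJoinSketch {

    // Order the join assumes for BOTH files: plain lexicographic string order.
    static final Comparator<String> LEXICOGRAPHIC = Comparator.naturalOrder();
    // A different order that an extension file might accidentally be sorted by.
    static final Comparator<String> NUMERIC = Comparator.comparingInt(Integer::parseInt);

    /** Counts extension rows attached to a core record in one forward pass. */
    static int countJoined(List<String> coreIds, List<String> extCoreIds) {
        int joined = 0;
        int e = 0;
        for (String coreId : coreIds) {
            // Extension rows that sort before the current core id are skipped for good.
            while (e < extCoreIds.size() && LEXICOGRAPHIC.compare(extCoreIds.get(e), coreId) < 0) {
                e++;
            }
            // Attach every extension row whose coreid equals the current core id.
            while (e < extCoreIds.size() && LEXICOGRAPHIC.compare(extCoreIds.get(e), coreId) == 0) {
                joined++;
                e++;
            }
        }
        return joined;
    }

    static List<String> sorted(List<String> values, Comparator<String> order) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(order);
        return copy;
    }

    static final List<String> CORE = Arrays.asList("2", "10", "1");
    static final List<String> EXTENSION = Arrays.asList("2", "10", "1", "2");

    /** Both files sorted lexicographically: all 4 extension rows are joined. */
    static int joinedWithConsistentOrder() {
        return countJoined(sorted(CORE, LEXICOGRAPHIC), sorted(EXTENSION, LEXICOGRAPHIC));
    }

    /** Extension sorted numerically instead: the row for core id "10" is never reached. */
    static int joinedWithMismatchedOrder() {
        return countJoined(sorted(CORE, LEXICOGRAPHIC), sorted(EXTENSION, NUMERIC));
    }

    public static void main(String[] args) {
        System.out.println("consistent order: " + joinedWithConsistentOrder() + " of 4"); // 4 of 4
        System.out.println("mismatched order: " + joinedWithMismatchedOrder() + " of 4"); // 3 of 4
    }
}
```

In the mismatched case the iterator raises no error and simply returns fewer extension rows, which matches the symptom reported above (661 of 740).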
    


Author: mdoering@gbif.org
Created: 2014-09-24 18:00:09.169
Updated: 2014-09-24 18:00:09.169
        
Ignored test that shows the problem is here:
https://github.com/gbif/dwca-reader/blob/master/src/test/java/org/gbif/dwc/text/ArchiveFactoryTest.java#L195
    


Author: mblissett
Created: 2018-03-20 17:45:05.937
Updated: 2018-03-20 17:45:05.937
        
Resolved in https://github.com/gbif/gbif-common/commit/1768fc39e809550b3124372717f8f76301210b92 and thus https://github.com/gbif/dwca-io/commit/259fe9a502b37901bac304657c42859633ffe3bc.