Issue 12215

Create a DwC-A "Fragmenter"

12215
Reporter: lfrancke
Assignee: lfrancke
Type: NewFeature
Summary: Create a DwC-A "Fragmenter"
Priority: Major
Resolution: Fixed
Status: Closed
Created: 2012-11-05 17:13:21.61
Updated: 2013-12-17 15:46:36.472
Resolved: 2013-01-18 15:52:12.386
        
Description: This tool needs to act on the messages from the DwC-A validator. It should read the content of the DwC-A file and emit the records in it one at a time to be processed by the Occurrence processing.
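
A minimal sketch of the "emit records one at a time" idea, written in Python for brevity rather than in the project's own language. It assumes the archive is a zip whose core data file is tab-delimited UTF-8 with a header row; the function name `iter_records` and the default file name `occurrence.txt` are hypothetical. A real implementation would read the core file location, delimiter, and field mapping from the archive's meta.xml instead of hard-coding them.

```python
import csv
import io
import zipfile
from typing import Dict, Iterator


def iter_records(archive_path: str, core_file: str = "occurrence.txt") -> Iterator[Dict[str, str]]:
    """Yield each row of the core data file as a dict, one record at a time.

    Assumptions (not taken from the ticket): tab-delimited, UTF-8,
    first line is a header, no quoting.
    """
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(core_file) as raw:
            # Wrap the binary stream so csv can read it as text without
            # loading the whole file into memory.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                yield row
```

Because this is a generator, the caller can hand each record to downstream processing as it is read, without holding the whole archive in memory.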

As an optimization we could take a hash of the content before processing and compare it to the last processed value, skipping processing if it's the same.
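
The hash check could look roughly like the sketch below. It is illustrative only: SHA-256 is an arbitrary choice, the names `content_hash` and `should_process` are hypothetical, and where the last processed hash is stored is left open.

```python
import hashlib
from typing import Optional


def content_hash(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_process(path: str, last_hash: Optional[str]) -> bool:
    """Return True unless the file hashes to the last processed value."""
    return content_hash(path) != last_hash
```

A caller would store the new digest after a successful run and pass it as `last_hash` the next time the same dataset arrives; passing `None` forces processing.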

The dwca-reader might need changing to emit the rawest form possible. This needs discussion with [~mdoering] and [~omeyn].
    


Author: mdoering@gbif.org
Comment: I was planning to create a new multi module project for the other 3 dwca crawler related projects. Mind adding this one to it?
Created: 2012-11-14 23:53:27.432
Updated: 2012-11-14 23:53:27.432


Author: lfrancke@gbif.org
Comment: Yeah, this one is about three classes and only a few lines of code, so I've added it to the crawler-cli project directly (similar to the normal fragmenter).
Created: 2012-11-14 23:56:45.676
Updated: 2012-11-14 23:56:45.676


Author: mdoering@gbif.org
Comment: Should we do the same for the others, considering that we keep a core validator library separate? The metasync and downloader ones will be fairly simple too, and if the core validation is in a separate library the CLI wrapping should be simple as well. Your call, but I think it's best to keep all of them together.
Created: 2012-11-15 00:02:01.908
Updated: 2012-11-15 00:02:01.908