Issue 17303

DwC-A constituent metadata not found by dwca-metasync

Reporter: mdoering
Assignee: mdoering
Type: Bug
Summary: DwC-A constituent metadata not found by dwca-metasync
Priority: Critical
Resolution: Fixed
Status: Closed
Created: 2015-02-23 11:40:53.574
Updated: 2015-02-23 18:00:39.012
Resolved: 2015-02-23 18:00:20.478
        
Description: dwca-metasync does not find any of the dataset constituent metadata files. For example, the CoL archive has about a hundred of them, but metasync reports that zero constituents were found in the archive:

23.02 11:20:35	dwca-metasync	INFO	0 constituents metadata found in archive 7ddf754f-d193-4cc9-b351-99906754a03b

Looking at the decompressed dwca folder in storage, one can see that all files are extracted into the root of storage/dwca/7ddf754f-d193-4cc9-b351-99906754a03b instead of being kept in the expected dataset subfolder.

We should extract archive files keeping their subfolder during crawling.
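
A minimal sketch of subfolder-preserving extraction, for illustration only. It is written in Python rather than taken from the crawler, the function name extract_keeping_subfolders is hypothetical, and it assumes the archive is a plain zip file. The point it shows is that each entry is written to its own relative path under the target folder, so a file such as dataset/<name>.xml stays in the dataset subfolder instead of landing in the root:

```python
import zipfile
from pathlib import Path


def extract_keeping_subfolders(archive_path, target_dir):
    """Extract a zip archive, recreating each entry's subfolder under target_dir.

    Hypothetical helper for illustration; returns the extracted relative paths.
    """
    target = Path(target_dir).resolve()
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            destination = (target / info.filename).resolve()
            # Refuse entries that would escape the target folder (e.g. "../x").
            if target != destination and target not in destination.parents:
                raise ValueError(f"unsafe entry path: {info.filename}")
            if info.is_dir():
                destination.mkdir(parents=True, exist_ok=True)
                continue
            # Create the entry's subfolder instead of flattening into the root.
            destination.parent.mkdir(parents=True, exist_ok=True)
            with archive.open(info) as source, open(destination, "wb") as sink:
                sink.write(source.read())
            extracted.append(destination.relative_to(target).as_posix())
    return extracted
```

With extraction done this way, a later step that looks for constituent metadata in the dataset subfolder of the decompressed archive would find the files where it expects them.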
    


Author: mdoering@gbif.org
Comment: Keeping subfolders during decompression: https://github.com/gbif/crawler/commit/3cd5e38bfbc67f466156b6a109d35b8a1b0c164f
Created: 2015-02-23 11:48:33.694
Updated: 2015-02-23 11:48:33.694


Author: mdoering@gbif.org
Comment: CoL constituents are found now
Created: 2015-02-23 18:00:20.501
Updated: 2015-02-23 18:00:20.501