Issue 17781

single CSV file as dwca endpoint fails metasync

17781
Reporter: mdoering
Type: Bug
Summary: single CSV file as dwca endpoint fails metasync
Priority: Critical
Status: Open
Created: 2015-08-28 21:54:08.363
Updated: 2015-08-28 21:56:24.879
        
Description: Indexing https://raw.githubusercontent.com/mdoering/ion-taxonomic-hierarchy/master/classification.tsv, the endpoint of dataset http://www.gbif.org/dataset/8dc469b3-8e61-4f6f-b9db-c70dbbc8858c, fails during metasync.

Kibana shows this exception:
org.gbif.dwca.io.UnsupportedArchiveException: The archive given is a folder with more or less than 1 data files having a csv, txt or tab suffix
    at org.gbif.dwca.io.ArchiveFactory.openArchive(ArchiveFactory.java:366) ~[crawler-cli.jar:na]
    at org.gbif.crawler.dwca.metasync.DwcaMetasyncService$DwcaValidationFinishedMessageCallback.handleMessageInternal(DwcaMetasyncService.java:116) [crawler-cli.jar:na]


In the dwca storage directory, the dataset uuid folder contains a single file with the .dwca suffix.
The dwca-reader/io library does not recognise the .dwca suffix. As the exception message states, it expects the data file to have a csv, txt or tab suffix.
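
One possible workaround, sketched here only with the Java standard library: before the dataset folder is opened, give the stored file a sibling copy whose name ends in a suffix the reader accepts. The class and method names (DwcaSuffixWorkaround, hasDataSuffix, withDataSuffix) are hypothetical and not part of crawler-cli or the dwca-io library. The suffix list is taken from the exception message above, and whether such a copy is enough for metasync to succeed is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of crawler-cli or dwca-io.
final class DwcaSuffixWorkaround {

    // Suffixes named in the exception message; the exact set the library accepts is an assumption.
    private static final List<String> DATA_SUFFIXES = Arrays.asList(".csv", ".txt", ".tab");

    // True if the file name ends in one of the suffixes above (case-insensitive).
    static boolean hasDataSuffix(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return DATA_SUFFIXES.stream().anyMatch(name::endsWith);
    }

    // Returns the path unchanged if it is not a regular file or already has a
    // recognised suffix. Otherwise copies it next to the original with ".txt"
    // appended (e.g. "x.dwca" becomes "x.dwca.txt") and returns the copy.
    static Path withDataSuffix(Path file) throws IOException {
        if (!Files.isRegularFile(file) || hasDataSuffix(file)) {
            return file;
        }
        Path copy = file.resolveSibling(file.getFileName() + ".txt");
        return Files.copy(file, copy, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

The original .dwca file is left in place, so the dataset uuid folder would then hold exactly one file with a recognised data suffix. A fix inside the library or the downloader (storing single-file endpoints with a proper suffix in the first place) would likely be cleaner than this copy step.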