Issue 12611

Invalid checklist archive still indexed - content is corrupted

Reporter: kbraak
Assignee: mdoering
Type: Bug
Summary: Invalid checklist archive still indexed - content is corrupted
Priority: Major
Resolution: Invalid
Status: Resolved
Created: 2013-01-21 16:43:11.667
Updated: 2015-03-02 15:28:02.985
Resolved: 2015-03-02 15:28:02.961
        
Description: An invalid DwC-A checklist (USDA PLANTS Database) still gets indexed. The result is corrupt data in the index, which gets displayed in the Portal.

To illustrate the problem, I attach various screenshots, and include relevant links below:

1) ChecklistBank page for USDA PLANTS Database: http://ecat-dev.gbif.org/checklist/1014
2) DwC-A validation report, in which everything seems to be broken. Apparently the validator was not even able to finish writing the final report: http://tools.gbif.org/dwca-reports/021-1497717370733740636.html
3) Portal page for USDA Plants Database: http://staging.gbif.org:8080/portal-web-dynamic/dataset/705922f7-5ba5-49ab-a75d-722e3090e690

Attachment Screen Shot 2013-01-21 at 4.23.24 PM.png


Attachment Screen Shot 2013-01-21 at 4.32.10 PM.png


Attachment Screen Shot 2013-01-21 at 4.37.30 PM.png

Attachment usda_archive.zip


Author: kbraak@gbif.org
Comment: USDA Plants Database DwC-A attached  
Created: 2013-01-21 16:47:00.046
Updated: 2013-01-21 16:47:00.046


Author: kbraak@gbif.org
Comment: A description of the errors encountered in the DwC-A is provided under a separate DwC-A validator issue: http://code.google.com/p/darwincore/issues/detail?id=165
Created: 2013-01-21 17:32:12.35
Updated: 2013-01-21 17:32:12.35


Author: mdoering@gbif.org
Comment: Now that crawling is RabbitMQ-based, DwC-A files have to pass the validator.
Created: 2015-03-02 15:28:02.982
Updated: 2015-03-02 15:28:02.982
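
Note: the resolution above amounts to gating indexing on validation. The following is only a minimal sketch of that idea, not the GBIF validator or crawler code. The function names `basic_dwca_problems` and `should_index` are hypothetical, and the checks are deliberately shallow (readable zip, a well-formed `meta.xml`, and every `<location>` it lists present in the archive). It also assumes that a `meta.xml` descriptor is always required, which is stricter than what a real validator may enforce.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def basic_dwca_problems(archive_bytes):
    """Return a list of problems found; an empty list means the basic checks passed.

    Hypothetical helper: this is far less thorough than a real DwC-A validator.
    """
    problems = []
    try:
        archive = zipfile.ZipFile(io.BytesIO(archive_bytes))
    except zipfile.BadZipFile:
        return ["not a readable zip file"]
    with archive:
        names = set(archive.namelist())
        # Assumption: the descriptor must always be present.
        if "meta.xml" not in names:
            return ["meta.xml descriptor is missing"]
        try:
            root = ET.fromstring(archive.read("meta.xml"))
        except ET.ParseError as exc:
            return [f"meta.xml is not well-formed XML: {exc}"]
        # Every <location> in the descriptor must name a file in the archive.
        for element in root.iter():
            # Strip an XML namespace prefix such as "{http://...}location".
            if element.tag.rsplit("}", 1)[-1] == "location":
                location = (element.text or "").strip()
                if location not in names:
                    problems.append(
                        f"data file listed in meta.xml is missing: {location!r}"
                    )
    return problems


def should_index(archive_bytes):
    """Gate: only archives with no detected problems go on to indexing."""
    return not basic_dwca_problems(archive_bytes)
```

A crawler step could call `should_index` on the downloaded bytes and skip the archive, logging the problem list, whenever it returns `False`.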