Issue 14854

Occurrence indexing validator should ignore empty dwca lines/records

Reporter: mdoering
Type: Bug
Summary: Occurrence indexing validator should ignore empty dwca lines/records
Priority: Critical
Status: Open
Created: 2014-01-21 11:41:34.073
Updated: 2015-03-10 11:00:32.565
        
Description: According to Jan, some datasets contain empty records consisting only of whitespace, which the dwca reader still reads as records. Because these records all have an empty string as their id, the crawler validator then rejects the archive for having non-unique occurrence ids. We need to ignore those records everywhere.
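
A minimal sketch of the proposed behaviour, in Python for illustration only: whitespace-only records are dropped before the uniqueness check, so their empty ids no longer collide. The function names, the dict-per-record representation and the `id` field name are assumptions made for this example; they are not the actual dwca reader or crawler validator API.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Optional

# Hypothetical record shape: one dict per archive line, column name -> raw value.
Record = Mapping[str, Optional[str]]


def is_blank_record(record: Record) -> bool:
    """Return True when every field is missing or contains only whitespace."""
    return all(value is None or not value.strip() for value in record.values())


def duplicate_occurrence_ids(records: Iterable[Record], id_field: str = "id") -> List[str]:
    """Return ids that occur more than once, ignoring blank records entirely."""
    counts: Counter = Counter()
    for record in records:
        if is_blank_record(record):
            # Skip empty lines instead of counting them as records with id "".
            continue
        counts[(record.get(id_field) or "").strip()] += 1
    return sorted(occ_id for occ_id, n in counts.items() if n > 1)


if __name__ == "__main__":
    sample = [
        {"id": "a", "scientificName": "Coffea arabica"},
        {"id": "", "scientificName": "   "},   # whitespace-only line
        {"id": " ", "scientificName": None},   # another empty line
        {"id": "b", "scientificName": "Coffea arabica"},
    ]
    print(duplicate_occurrence_ids(sample))
```

With the blank-record check in place, the two empty lines in the sample are not counted, so no duplicate ids are reported. Without it, both would be counted under the empty-string id and the archive would be flagged as non-unique.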

Example dataset:
http://www.gbif.fr:8080/ipt/archive.do?r=guyane_coffea


These datasets under the INRA Antilles-Guyane publisher all have that problem:
http://registry.gbif.org/web/index.html#/organization/4bce8f23-20a5-48a7-b25a-83700caad2db/owned

- Guyane_française_caféier
- Guadeloupe_Ananas
- Guadeloupe_Bananier
- Guyane_Cacaoyer
- Guadeloupe_Taro
- Guadeloupe_Manguier