Issue 14854
Occurrence indexing validator should ignore empty dwca lines/records
14854
Reporter: mdoering
Type: Bug
Summary: Occurrence indexing validator should ignore empty dwca lines/records
Priority: Critical
Status: Open
Created: 2014-01-21 11:41:34.073
Updated: 2015-03-10 11:00:32.565
Description: According to Jan, some datasets contain empty records consisting only of whitespace, which the dwca reader still returns as records. Because they all have an empty string as their id, the crawler validator then rejects the archive as having non-unique occurrence ids. We need to simply ignore those records everywhere.
Example dataset:
http://www.gbif.fr:8080/ipt/archive.do?r=guyane_coffea
These datasets under the INRA Antilles-Guyane publisher all have that problem:
http://registry.gbif.org/web/index.html#/organization/4bce8f23-20a5-48a7-b25a-83700caad2db/owned
- Guyane_française_caféier
- Guadeloupe_Ananas
- Guadeloupe_Bananier
- Guyane_Cacaoyer
- Guadeloupe_Taro
- Guadeloupe_Manguier
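
A minimal sketch of the proposed behaviour, in Python for illustration only (it is not the actual crawler or dwca reader code). It assumes each record arrives as a list of string fields with the occurrence id in the first column; the names `is_blank_record`, `non_blank_records` and `duplicate_ids` are hypothetical. Records whose fields are all empty or whitespace are dropped before the uniqueness check, so they no longer collide on the empty string id.

```python
from typing import Iterable, Iterator, List, Optional, Set


def is_blank_record(fields: List[Optional[str]]) -> bool:
    """True if every field is missing, empty, or only whitespace."""
    return all(f is None or f.strip() == "" for f in fields)


def non_blank_records(
    records: Iterable[List[Optional[str]]],
) -> Iterator[List[Optional[str]]]:
    """Yield only the records that carry at least one non-blank field."""
    for fields in records:
        if not is_blank_record(fields):
            yield fields


def duplicate_ids(
    records: Iterable[List[Optional[str]]], id_index: int = 0
) -> Set[str]:
    """Return the ids that occur more than once, ignoring blank records.

    id_index is an assumption: the position of the occurrence id column.
    """
    seen: Set[str] = set()
    duplicates: Set[str] = set()
    for fields in non_blank_records(records):
        # A short record or a missing value is treated as an empty id.
        raw = fields[id_index] if id_index < len(fields) else ""
        occurrence_id = (raw or "").strip()
        if occurrence_id in seen:
            duplicates.add(occurrence_id)
        seen.add(occurrence_id)
    return duplicates


# Two whitespace-only lines between valid records no longer count as
# duplicate (empty) ids.
rows = [["1", "Coffea"], ["  ", ""], ["", ""], ["2", "Coffea"]]
print(duplicate_ids(rows))
```

Filtering in one shared helper, rather than inside the validator alone, matches the intent to ignore such records everywhere: any other consumer of the archive could wrap its input in the same filter.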