Issue 13527

This download produces invalid files. There are...

13527
Reporter: trobertson
Assignee: fmendez
Type: Bug
Summary: This download produces invalid files.    There are...
Resolution: Fixed
Status: Closed
Created: 2013-07-22 19:38:21.012
Updated: 2016-09-28 17:02:59.544
Resolved: 2013-07-29 13:57:31.007
        
        
Description: This download produces invalid files.

There are delimitation issues in the file (it looks like there are tabs in the scientific name authorship).

Perhaps a UDF to replace each tab with a space?

This is a critical issue.


Author: mdoering@gbif.org
Comment: Wouldn't it be better to normalize whitespace when we index instead of replacing it when exporting? Does it come from the nub?
Created: 2013-07-22 19:40:54.802
Updated: 2013-07-22 19:40:54.802
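
The index-time normalization suggested in the comment above could look roughly like the sketch below. It is an illustration only, not the project's code: the function name normalize_whitespace is hypothetical, and collapsing every run of whitespace into a single space is an assumption about what "normalize" would mean here.

```python
import re

# Matches any run of whitespace: tabs, newlines, carriage returns, spaces.
_WHITESPACE_RUN = re.compile(r"\s+")


def normalize_whitespace(value):
    """Collapse each whitespace run into one space and trim the ends.

    Hypothetical helper: None is passed through unchanged so that a
    missing value stays distinguishable from an empty string.
    """
    if value is None:
        return None
    return _WHITESPACE_RUN.sub(" ", value).strip()


# An authorship string containing a tab and a line break.
cleaned = normalize_whitespace("(Linnaeus,\t1771)\n")
```

Normalizing once at index time would mean every consumer of the index sees clean values, whereas replacing at export time only protects the download files.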


Author: kbraak@gbif.org
Comment: Apparently unwanted newline characters are getting into the download as well. See PF-96, as reported by Nicolas.
Created: 2013-07-24 10:41:19.23
Updated: 2013-07-24 10:48:13.101


Author: fmendez@gbif.org
Comment: I can't find the problem with the reported file. The query removes the tabs with something like this: regexp_replace(cleanNull(scientific_name_author),"\t|\n"," ") AS scientific_name_author
Created: 2013-07-24 15:04:42.517
Updated: 2013-07-24 15:04:42.517
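
For readers without the export query at hand, the replacement quoted in the comment above can be approximated outside Hive as follows. This is a sketch under assumptions: clean_field is a hypothetical name, and since the behaviour of cleanNull is not shown in this thread, mapping a missing value to an empty string is a guess. The pattern itself is copied from the comment.

```python
import re

# Same pattern as in the quoted query: a single tab or a single newline.
_TAB_OR_NEWLINE = re.compile(r"\t|\n")


def clean_field(value):
    """Replace each tab or newline with one space.

    Assumption: a missing value becomes an empty string, standing in
    for whatever cleanNull does in the real query.
    """
    if value is None:
        return ""
    return _TAB_OR_NEWLINE.sub(" ", value)


# Three fields, one of which contains an embedded tab.
row = ["Puma concolor", "(Linnaeus,\t1771)", None]

# After cleaning, the only tabs left in the line are the two delimiters.
line = "\t".join(clean_field(v) for v in row)
```

Note that this pattern leaves carriage returns untouched, so a field containing "\r" would still pass through unchanged.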


Author: fmendez@gbif.org
Created: 2013-07-29 13:57:31.039
Updated: 2013-07-29 13:57:31.039
        
Tabs and line-break characters are now removed from uninterpreted text fields.

http://code.google.com/p/gbif-occurrencestore/source/detail?r=2085