Issue 18755

Newline in recordedBy field being passed through to downloads

Reporter: nickyn
Type: Feedback
Summary: Newline in recordedBy field being passed through to downloads
Resolution: Fixed
Status: Closed
Created: 2016-09-28 18:54:31.041
Updated: 2016-10-04 17:15:33.796
Resolved: 2016-10-04 17:15:33.752
        
        
Description: This record includes a newline in the recordedBy field. This appears to cause a problem in a DwCA download that includes the record: the recordedBy value is truncated, and the following line in the included occurrence.txt does not start with a GBIF ID.

{
  "key": 211642239,
  "datasetKey": "96b7961a-f762-11e1-a439-00145eb45e9a",
  "publishingOrgKey": "95cb537c-74c5-4c1e-ae24-32e7ea08f380",
  "publishingCountry": "ES",
  "protocol": "TAPIR",
  "lastCrawled": "2013-09-07T07:08:01.000+0000",
  "crawlId": 1,
  "extensions": {},
  "basisOfRecord": "PRESERVED_SPECIMEN",
  "taxonKey": 2874237,
  "kingdomKey": 6,
  "phylumKey": 7707728,
  "classKey": 220,
  "orderKey": 1414,
  "familyKey": 6631,
  "genusKey": 2874237,
  "scientificName": "Viola L.",
  "kingdom": "Plantae",
  "phylum": "Tracheophyta",
  "order": "Malpighiales",
  "family": "Violaceae",
  "genus": "Viola",
  "genericName": "Viola",
  "taxonRank": "GENUS",
  "stateProvince": "Le",
  "year": 1980,
  "month": 7,
  "day": 18,
  "eventDate": "1980-07-18T00:00:00.000+0000",
  "issues": [
    "TAXON_MATCH_HIGHERRANK"
  ],
  "lastInterpreted": "2016-08-04T10:22:34.495+0000",
  "license": "http://creativecommons.org/licenses/by-nc/4.0/legalcode",
  "identifiers": [],
  "facts": [],
  "relations": [],
  "class": "Magnoliopsida",
  "countryCode": "ES",
  "country": "Spain",
  "recordedBy": "H.S. Nava & Mª.A. Fdez. Casado\nMª.A. Fdez. Casado",
  "catalogNumber": "7541-1",
  "institutionCode": "FCO",
  "locality": "Vegarada",
  "collectionCode": "FCO",
  "gbifID": "211642239"
}
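The failure described above can be avoided by replacing line breaks in field values before each row is written. The following is a minimal sketch of that idea, not the fix that was actually deployed; the helper names `clean_field` and `to_tsv_row` are hypothetical, and it assumes a tab-delimited output with one record per line.

```python
import re

# Line breaks and tabs inside a value would split or shift a tab-delimited row.
_BREAKS = re.compile(r"[\r\n\t]+")


def clean_field(value):
    """Return the value as text with embedded line breaks and tabs collapsed to one space."""
    if value is None:
        return ""
    return _BREAKS.sub(" ", str(value)).strip()


def to_tsv_row(record, columns):
    """Build one tab-delimited line (without trailing newline) from a record dict."""
    return "\t".join(clean_field(record.get(name)) for name in columns)


# Values taken from the record in this issue.
record = {
    "gbifID": "211642239",
    "recordedBy": "H.S. Nava & Mª.A. Fdez. Casado\nMª.A. Fdez. Casado",
    "locality": "Vegarada",
}
row = to_tsv_row(record, ["gbifID", "recordedBy", "locality"])
print(row)
```

Replacing the break with a space keeps both name fragments in the recordedBy column. Whether a space, another separator, or quoting is the right choice depends on the download format.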
    


Author: cgendreau
Comment: Same cause as PF-2622
Created: 2016-09-30 09:50:56.401
Updated: 2016-09-30 09:50:56.401


Author: nickyn
Created: 2016-10-04 13:35:41.428
Updated: 2016-10-04 13:35:41.428
        
Although the comment on http://dev.gbif.org/issues/browse/PF-2622 says it is fixed, a download (ID 0015410-160910150852091) requested after deployment of the fix (2016-10-03) shows the same problem (newlines splitting data lines).
The following records included in the download show this problem.
GBIF occurrence IDs of records with a newline in the recordedBy field:
46994803
272168089
204419285
111300924
214714290
These records include a newline in the locality field:
788898353
730641673

Note that this request was for a simple CSV download, not a full DwCA download.
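Split records of this kind can be found by checking that every data line starts with a numeric GBIF ID. The sketch below shows one way to do that. It assumes the file is tab-delimited with a header line and the ID in the first column; the function name `find_broken_lines` is hypothetical, and the sample text is built from the record in this issue rather than copied from the actual download.

```python
import io
import re

# A GBIF occurrence ID is expected to be a plain run of ASCII digits.
_ID = re.compile(r"[0-9]+")


def find_broken_lines(lines, has_header=True):
    """Return (line_number, text) for data lines whose first column is not a numeric ID."""
    broken = []
    for number, line in enumerate(lines, start=1):
        if has_header and number == 1:
            continue  # skip the column header
        text = line.rstrip("\r\n")
        first_column = text.split("\t", 1)[0]
        if not _ID.fullmatch(first_column):
            broken.append((number, text))
    return broken


# Illustrative sample: one record split in two by a newline in recordedBy.
sample = io.StringIO(
    "gbifID\trecordedBy\tlocality\n"
    "211642239\tH.S. Nava & Mª.A. Fdez. Casado\n"
    "Mª.A. Fdez. Casado\tVegarada\n"
)
broken = find_broken_lines(sample)
print(broken)
```

The line before each reported line is the truncated start of the affected record, so its first column gives the occurrence ID to report.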
    


Author: cgendreau
Created: 2016-10-04 13:48:32.256
Updated: 2016-10-04 13:48:32.256
        
Hi Nicky,

The deployment was started yesterday, but the part that builds the table for downloads is still running.

I will test your download (same query as 0015410-160910150852091) once that has finished and close the issue(s) if it succeeds; otherwise I will add a comment.

Sorry for the delay.