Issue 18273

Broken links and unrecognised values in Brazilian dataset

Reporter: rdmpage
Created: 2016-02-28 21:09:54.754
Updated: 2016-02-29 13:35:32.353
Description: The Brazilian Flora dataset has some problems.

It looks like every link shown in the "Overview" section for each taxon is broken and resolves to a 404. I'm guessing that the site has been reorganised since these links were stored in the database.
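A rough way to confirm how many stored links are broken is to request each one and record its HTTP status. The following is a minimal sketch using only the Python standard library; the function names `http_status` and `broken_links` and the injectable `fetch` parameter are hypothetical and are not part of any GBIF tooling.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a URL, including error codes such as 404."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is still useful.
        return err.code


def broken_links(
    urls: Iterable[str],
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, int]:
    """Map every URL whose status is 400 or above to that status."""
    result = {}
    for url in urls:
        status = fetch(url)
        if status >= 400:
            result[url] = status
    return result
```

Connection failures (DNS errors, timeouts) are not handled here and would propagate as exceptions. Passing a custom `fetch` makes it possible to check the logic without network access.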

The values for some standard fields are in Portuguese, and hence aren't recognised. For example, 63,932 have nomenclatural status "unknown" and 48,401 have taxonomic status "unknown".

This is likely because the source has values such as "NOME_ACEITO" for the field "DWC:TAXONOMICSTATUS".
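To illustrate why such values end up as "unknown", here is a minimal sketch of a status parser backed by a lookup table. It is not the actual GBIF parser: the table, the function names, and the mapping of "NOME_ACEITO" to "ACCEPTED" (assumed from its Portuguese meaning) are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical synonym table. Only "NOME_ACEITO" appears in the issue; mapping
# it to "ACCEPTED" is an assumption based on its Portuguese meaning.
STATUS_SYNONYMS = {
    "ACCEPTED": "ACCEPTED",
    "NOME_ACEITO": "ACCEPTED",
}

UNKNOWN = "UNKNOWN"


def parse_taxonomic_status(raw: Optional[str]) -> str:
    """Normalise a raw status string; fall back to UNKNOWN if unrecognised."""
    if raw is None:
        return UNKNOWN
    key = raw.strip().upper().replace(" ", "_")
    return STATUS_SYNONYMS.get(key, UNKNOWN)


def count_statuses(raw_values: Iterable[Optional[str]]) -> Counter:
    """Tally normalised statuses so unrecognised values stand out."""
    return Counter(parse_taxonomic_status(value) for value in raw_values)
```

Any value missing from the table falls through to "UNKNOWN", so a dataset that uses Portuguese terms throughout would show large "unknown" counts until the table is extended.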

Created: 2016-02-29 10:36:28.287
Updated: 2016-02-29 10:36:28.287
This is strange, I just indexed some last week and the few links I tested were fine then. I also updated our parsers, adding the missing Portuguese status & rank values:

This is a link of a currently working species:

Using this format the link works:
The one from the dwca does not:

Publisher contacted.

Created: 2016-02-29 13:14:56.227
Updated: 2016-02-29 13:15:32.561
Rod, I have reindexed the flora and most parsing problems have vanished:

The vernacular name ones are also caused by bad code, which I have just fixed but haven't deployed yet.

Author: rdmpage
Created: 2016-02-29 13:25:42.407
Updated: 2016-02-29 13:25:42.407
Hi Markus,

Hmmm, but the stats page shows a massive failure to match the backbone?!


Created: 2016-02-29 13:35:00.008
Updated: 2016-02-29 13:35:32.348
Yes, something is wrong in our indexing code:
No idea yet what that is ... previously there were 17,000 names or so genuinely not found in our backbone.