Issue 11465

German language descriptions on species page show incorrect characters

11465
Reporter: trobertson
Assignee: mdoering
Type: Bug
Summary: German language descriptions on species page show incorrect characters
Description: See Plantae: character encoding issues such as "Beispiel für den" (presumably meant to read "Beispiel für den"), which seem to come from Wikipedia
Priority: Blocker
Resolution: Fixed
Status: Closed
Created: 2012-06-22 14:26:21.022
Updated: 2013-08-29 14:44:51.491
Resolved: 2012-10-24 17:24:18.233


Author: mdoering@gbif.org
Created: 2012-06-25 15:06:42.111
Updated: 2012-06-25 15:06:42.111
        
There are 2 issues here:
a) The Wikipedia DwC-A builder needs to be improved so that it does not generate such descriptions, but that might be very hard or even impossible to guarantee for bad source data.

b) The nub should try to ignore descriptions with bad characters, similar to how we treat bad vernacular names. This requires flagging "bad" descriptions, which is currently not possible in the data model.
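
To illustrate point b), here is a minimal sketch of how "bad" descriptions might be detected. It is not the nub's actual code. It assumes the garbling comes from UTF-8 bytes being read as Latin-1 (which would explain "ü" turning into "Ã¼"), and the names `looks_garbled`, `try_repair` and `usable_descriptions` are hypothetical. The regex is only a heuristic and may miss other kinds of damage, such as text misread as Windows-1252.

```python
import re
from typing import Iterable, List, Optional

# Heuristic (assumption): a UTF-8 two-byte sequence misread as Latin-1 shows up
# as a character in U+00C2..U+00DF followed by one in U+0080..U+00BF.
# U+FFFD is the Unicode replacement character left behind by lossy decoding.
GARBLED = re.compile("[\u00C2-\u00DF][\u0080-\u00BF]|\uFFFD")


def looks_garbled(text: str) -> bool:
    """Return True if the text contains a typical mis-decoding pattern."""
    return GARBLED.search(text) is not None


def try_repair(text: str) -> Optional[str]:
    """Undo a UTF-8-read-as-Latin-1 mix-up, or return None if that fails."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


def usable_descriptions(descriptions: Iterable[str]) -> List[str]:
    """Keep only descriptions that do not look garbled (hypothetical filter)."""
    return [d for d in descriptions if not looks_garbled(d)]
```

With this sketch, the string from the issue description would be dropped by `usable_descriptions`, while a correctly encoded "Beispiel für den" would pass through. Whether to drop or repair flagged descriptions is a design choice; dropping is the safer default because `try_repair` can only undo this one specific kind of mix-up.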
    


Author: trobertson@gbif.org
Comment: Until this is solved, why not simply remove that content (description) and only include it when it is ready?
Created: 2012-06-25 17:39:12.108
Updated: 2012-06-25 17:39:12.108


Author: mdoering@gbif.org
Created: 2012-10-24 17:24:18.286
Updated: 2012-10-24 17:24:18.286
        
This is no longer an issue, since the Wikipedia archive generator now uses a new parser.
Plants and other Wikipedia pages have had proper characters since then:
http://staging.gbif.org:8080/portal-web-dynamic/species/6