Issue 11259

Data extraction: failing due to nub content issues of current nub version, needs review when new nub is live

Reporter: ahahn
Assignee: ahahn
Type: Bug
Summary: Data extraction: failing due to nub content issues of current nub version, needs review when new nub is live
Priority: Minor
Status: Open
Created: 2012-05-30 11:08:01.282
Updated: 2012-11-20 17:16:50.078
        
Description: For http://data.gbif.org/datasets/resource/14026/, extraction loses about 370 taxon names. Many of the causes should be fixed in the new nub version; check extraction of this dataset when the first rollover with the new nub has finished:
- binomials, e.g. Doris montereyensis, should be included from new sources
- monomials: homonym issues should be resolved, e.g. for Asteraceae
Issues due to intra-rank names (subclass, subfamily) will not be resolved in the near future.

select distinct scientific_name, rank, kingdom, phylum, family
from raw_occurrence_record ror
left join occurrence_record occr on ror.id = occr.id
where ror.data_resource_id = 14026 and occr.id is null
order by 1;
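
For anyone re-checking this after a rollover, the select statement is an anti-join: it keeps raw records that have no matching row in occurrence_record, i.e. names that were not interpreted. The following is a minimal sketch of the same check against an in-memory SQLite database, not the production setup. The table layout is reduced to the columns used in the query, the sample rows and the resource id 99999 are invented for illustration, and the helper names build_sample_db and find_untied_names are hypothetical.

```python
import sqlite3

# Same anti-join as in the description, with the resource id as a parameter.
QUERY = """
SELECT DISTINCT ror.scientific_name, ror.rank, ror.kingdom, ror.phylum, ror.family
FROM raw_occurrence_record ror
LEFT JOIN occurrence_record occr ON ror.id = occr.id
WHERE ror.data_resource_id = ?
  AND occr.id IS NULL
ORDER BY 1
"""


def build_sample_db():
    """Create a throwaway database with invented rows (assumption: reduced schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE raw_occurrence_record (
            id INTEGER PRIMARY KEY,
            data_resource_id INTEGER,
            scientific_name TEXT,
            rank TEXT,
            kingdom TEXT,
            phylum TEXT,
            family TEXT
        );
        CREATE TABLE occurrence_record (id INTEGER PRIMARY KEY);
        """
    )
    raw_rows = [
        # (id, data_resource_id, scientific_name, rank, kingdom, phylum, family)
        (1, 14026, "Puma concolor", "species", "Animalia", "Chordata", "Felidae"),
        (2, 14026, "Doris montereyensis", "species", "Animalia", "Mollusca", None),
        (3, 14026, "Asteraceae", "family", "Plantae", None, "Asteraceae"),
        (4, 99999, "Quercus robur", "species", "Plantae", None, "Fagaceae"),
        (5, 99999, "Doris montereyensis", "species", "Animalia", "Mollusca", None),
        (6, 14026, "Doris montereyensis", "species", "Animalia", "Mollusca", None),
    ]
    conn.executemany(
        "INSERT INTO raw_occurrence_record VALUES (?, ?, ?, ?, ?, ?, ?)", raw_rows
    )
    # Only records 1 and 4 made it through interpretation in this sample.
    conn.executemany("INSERT INTO occurrence_record VALUES (?)", [(1,), (4,)])
    return conn


def find_untied_names(conn, data_resource_id):
    """Return distinct raw names of one resource that have no interpreted record."""
    return conn.execute(QUERY, (data_resource_id,)).fetchall()


if __name__ == "__main__":
    conn = build_sample_db()
    for row in find_untied_names(conn, 14026):
        print(row)
```

Passing the resource id as a parameter makes it easy to run the same check for other datasets; len(find_untied_names(conn, 14026)) gives a figure that can be compared between nub versions.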
    
Attachment untiedNamesFromINaturalist_rerun.txt
Attachment untiedNamesFromINaturalist.txt


Author: ahahn@gbif.org
Comment: Using the select statement from the issue description, the list is down to just one name as of Nov. 2012.
Created: 2012-11-20 16:21:29.221
Updated: 2012-11-20 16:21:29.221


Author: ahahn@gbif.org
Comment: Status change: only one uninterpreted hybrid name is left. Leaving the issue open to check back again, as one blocker still exists, but downgrading the priority.
Created: 2012-11-20 16:25:30.657
Updated: 2012-11-20 16:25:30.657