Issue 18709

Problem with indexing new CWR dataset?

Reporter: kylecopas
Assignee: jlegind
Type: Feedback
Summary: Problem with indexing new CWR dataset?
Description: Freshly published dataset from CGIAR has 3,403,811 records on GBIF Norway's source IPT, but nothing's been indexed. I'd guess you might already be on this, Jan, but flagging it just in case...
Resolution: Fixed
Status: Closed
Created: 2016-08-30 07:28:49.092
Updated: 2017-10-05 20:55:15.776
Resolved: 2017-10-05 20:55:15.753

Comment: Crawling and indexing are down for maintenance at least until tomorrow. Once they are back, we should still check that the dataset is picked up in indexing.
Created: 2016-08-30 11:57:56.344
Updated: 2016-08-30 11:57:56.344

Author: kylecopas
Created: 2016-08-30 12:16:01.67
Updated: 2016-08-30 12:16:01.67
Thanks, Jan.

@thirsch noted that the dataset might be in need of some more careful curation, as both the metadata and the link in the metadata suggest it's likely to duplicate records already in GBIF.

> Data was gathered from more than 100 data providers, including GBIF (a comprehensive list of institutions and individuals is available here:

The link notes that GBIF-mediated data were accessed in July 2012. There are also many records identified as coming from EURISCO, which could duplicate those published in this dataset (which itself is potentially in flux, as came up during the licensing consultation):

However, we may be able to enlist Dag's help in addressing these issues, given his involvement in getting the data published (or republished, maybe? I've lost track...)

Comment: GBIF Norway were notified of this problem.
Created: 2016-08-30 15:35:20.622
Updated: 2016-08-30 15:35:20.622

Author: mblissett
Comment: Count is correct now.
Created: 2017-10-05 20:55:15.774
Updated: 2017-10-05 20:55:15.774