Issue 18397

Occurrence dataset not indexed

18397
Reporter: rdmpage
Assignee: jlegind
Type: Feedback
Summary: Occurrence dataset not indexed
Description: I've uploaded a dataset with 22,000-odd occurrences but so far it doesn't seem to be indexed. Have I done something wrong? The dataset is http://www.gbif.org/dataset/33614778-513a-4ec0-814d-125021cca5fe Is the log for this dataset publicly accessible so that I can see what's gone wrong?
Priority: Major
Resolution: Fixed
Status: Closed
Created: 2016-04-13 15:46:56.411
Updated: 2016-04-18 09:13:48.704
Resolved: 2016-04-18 09:13:48.571


Author: jlegind@gbif.org
Created: 2016-04-13 16:12:34.745
Updated: 2016-04-13 16:12:34.745
        
Hi Rod, I launched a crawl of the resource and the data is now visible in the portal.

Can you explain the steps you took to upload the dataset? It would help to debug the process.
    


Author: rdmpage
Created: 2016-04-13 18:35:59.354
Updated: 2016-04-13 18:35:59.354
        
I kick it old skool, I call the API directly to create the data set and point it to the data.

POST to http://api.gbif.org/v1/dataset to create
POST to http://api.gbif.org/v1/dataset/33614778-513a-4ec0-814d-125021cca5fe/endpoint to set endpoint

This is the same way I uploaded The Plant List data.

I've noticed that the eml.xml file didn't render as I expected, probably because the  element has some embedded HTML (I loathe EML). I wonder if that caused a problem?
    


Author: jlegind@gbif.org
Created: 2016-04-15 10:48:54.232
Updated: 2016-04-15 10:48:54.232
        
I think you need to launch a crawl as well:

POST /dataset/33614778-513a-4ec0-814d-125021cca5fe/crawl
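
For reference, the calls discussed in this thread (set the endpoint, then launch a crawl) could be sketched roughly as below. This is only an illustration, not the script that was actually used: the JSON fields in the endpoint body, the example.org archive URL and the helper names are assumptions, and the requests are only built and printed, never sent. Real write calls would presumably also need registry credentials.

```python
import json
from urllib.request import Request

API = "http://api.gbif.org/v1"
DATASET_KEY = "33614778-513a-4ec0-814d-125021cca5fe"


def endpoint_request(dataset_key, archive_url):
    """Build (but do not send) the POST that registers the archive endpoint."""
    # Assumption: the body is JSON with "type" and "url" fields.
    body = json.dumps({"type": "DWC_ARCHIVE", "url": archive_url}).encode("utf-8")
    return Request(
        f"{API}/dataset/{dataset_key}/endpoint",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def crawl_request(dataset_key):
    """Build (but do not send) the POST that asks for a crawl of the dataset."""
    return Request(f"{API}/dataset/{dataset_key}/crawl", data=b"", method="POST")


# Hypothetical archive location, used for illustration only.
steps = [
    endpoint_request(DATASET_KEY, "http://example.org/dwca.zip"),
    crawl_request(DATASET_KEY),
]
for req in steps:
    print(req.get_method(), req.full_url)
```

Keeping the endpoint call and the crawl call in separate, clearly named helpers makes it harder to send the endpoint request twice by accident when the crawl was intended.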


    


Author: rdmpage
Created: 2016-04-15 10:59:42.825
Updated: 2016-04-15 10:59:42.825
        
Yes, I realised that yesterday (doh!). I'd thought I'd been doing that, but the script I used was copied and pasted from the script to set the endpoint, so it did that rather than launch a crawl. I then discovered that I'd done the same thing for The Plant List dataset, which was why it had several links to the same Darwin Core Archive instead of just one (I've since deleted the extra links). Once I figured this out I did the right thing and the data is now showing up nicely.

P.S. Don't tell Tim R. or he'll revoke my hacking privileges on the grounds of being an idiot ;)
    


Author: trobertson@gbif.org
Created: 2016-04-15 11:01:59.585
Updated: 2016-04-15 11:01:59.585
        
Not at all.

Please be aware though that crawling will be off until the new backbone is in place. I sent a message to the api-users list yesterday about that.

Crawling starting again - ETA: Tuesday
    


Author: rdmpage
Comment: Thanks Tim, yes, I'd seen that email on the API list. Looking forward to the new backbone being in place.
Created: 2016-04-15 11:04:40.104
Updated: 2016-04-15 11:04:40.104