Issue 18604

Allow searching datasets by their project identifier

Reporter: kbraak
Assignee: bko
Type: NewFeature
Summary: Allow searching datasets by their project identifier 
Priority: Unassessed
Resolution: Fixed
Status: Closed
Created: 2016-06-22 16:48:57.145
Updated: 2016-12-29 09:00:41.641
Resolved: 2016-12-28 21:37:02.185
        
Description: The GBIF Project model object is being updated with an identifier field (see POR-2562).

It would be nice to search and filter datasets by their project identifier, which links together multiple datasets associated with the same project. Partial matching should be possible, so that one can search for all projects whose identifiers start with a given prefix, e.g. "BID-AF2015".
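
To make the prefix-matching request concrete, here is a minimal client-side sketch in Python. It only illustrates the desired behaviour and says nothing about how the search index would implement it; the helper name `with_prefix` and the choice to ignore case are assumptions, and the identifiers are the ones that appear elsewhere in this issue.

```python
def with_prefix(identifiers, prefix):
    """Return the identifiers that start with `prefix`, ignoring case (an assumption)."""
    wanted = prefix.upper()
    return [i for i in identifiers if i.upper().startswith(wanted)]


# Project identifiers mentioned in this issue.
ids = ["BID-AF2015-0122-KYLE", "BID-AF2015-0122-FEDE", "AA003-AA003311F"]

# Keeps the two BID identifiers and drops the third.
print(with_prefix(ids, "BID-AF2015"))
```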

Projects such as BID, BioFresh and OpenUp generate a lot of primary biodiversity data. It is important to highlight these projects, give credit to their funders, and showcase all the data they help mobilise (see POR-3100).
    


Author: smasinde
Comment: I very much support this as it will help in quickly generating lists of published datasets when reporting to project funders, e.g., BID reports to EU.
Created: 2016-09-02 16:12:13.929
Updated: 2016-09-02 16:12:13.929


Author: kbraak@gbif.org
Created: 2016-09-09 12:06:26.445
Updated: 2016-09-09 12:06:26.445
        
[~fmendez@gbif.org] Implementation details, just to keep track of them:

We will add [Project.identifier|https://github.com/gbif/gbif-api/blob/master/src/main/java/org/gbif/api/model/registry/eml/Project.java#L39] as a field in SOLR. The SOLR API will enable analytics, such as listing the datasets of project X faceted by country. Additionally, [~hoefft] please keep in mind that we may want to add a project facet to the dataset search on the reengineered portal.

[~smasinde] we need to provide strict instructions to publishers on how to populate the project identifier in the metadata to avoid garbage coming in. 
    


Author: bko@gbif.org
Comment: For now, the textual information about a project is stored in the CMS as a project content type. To relate relevant information on either the dataset page or the project page, the project content model will need such a field, and it must be exposed via both the CMS project and search endpoints. If this is correct, please prompt me when this issue is being addressed.
Created: 2016-09-09 12:12:18.692
Updated: 2016-09-09 12:12:18.692


Author: kbraak@gbif.org
Created: 2016-09-27 11:03:57.654
Updated: 2016-09-27 11:03:57.654
        
[~bko@gbif.org] [~hoefft]

@Web team, to put this on your radar: [~fmendez@gbif.org] will try to implement what is described above next week, so that we can have it ready to test in dev late next week. Hopefully you can earmark some time the week after to update the Project page so that it uses the new SOLR request to list a project's datasets by its identifier. Thanks.
    


Author: kbraak@gbif.org
Created: 2016-09-27 16:01:28.434
Updated: 2016-09-27 16:01:28.434
        
[~bko@gbif.org] [~hoefft]

@Web team, [~fmendez@gbif.org] has finished the implementation well ahead of schedule.

The new SOLR request is now available to test from DEV.

Three datasets' eml.xml were updated to include a project identifier:

* 1 dataset with project identifier "BID-AF2015-0122-KYLE": http://api.gbif-dev.org/v1/dataset/search?project_id=BID-AF2015-0122-KYLE
* 2 datasets with project identifier "BID-AF2015-0122-FEDE": http://api.gbif-dev.org/v1/dataset/search?project_id=BID-AF2015-0122-FEDE
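
The two test requests above can also be built programmatically. This is a minimal Python sketch that only constructs the URLs; the base URL and the `project_id` parameter are taken from the links above, while the helper name `project_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# DEV endpoint as quoted in the links above.
DEV_SEARCH = "http://api.gbif-dev.org/v1/dataset/search"


def project_search_url(project_id, base=DEV_SEARCH):
    """Return the dataset search URL filtered by a single project identifier."""
    return base + "?" + urlencode({"project_id": project_id})


for pid in ("BID-AF2015-0122-KYLE", "BID-AF2015-0122-FEDE"):
    print(project_search_url(pid))
```

Using `urlencode` rather than string concatenation keeps the URL valid if an identifier ever contains spaces or other reserved characters.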

Burke, can you please update the Project page to allow the content manager to specify a single project identifier for the Project, and then use that identifier to query SOLR for a list of datasets related to the project for display on the page? Thanks.

This change to SOLR also enables a new facet by project identifier on the new portal should we decide to add that.


    


Author: hoefft
Created: 2016-09-28 08:56:22.349
Updated: 2016-09-28 08:56:22.349
        
Dataset search:
The free text search captures these nicely: http://api.gbif-dev.org/v1/dataset/search?q=BID-AF2015-0122-KYLE
We could easily create a dedicated search field for project ID as well, but it would work even as it is.

    


Author: bko@gbif.org
Created: 2016-09-28 13:54:53.259
Updated: 2016-09-28 13:54:53.259
        
For the project page to show the indicated datasets, I believe a dedicated field is required. Searching with ?q=BID-AF2015-0122-KYLE and searching with ?project_id=BID-AF2015-0122-KYLE mean different things.

Consider a scenario in which datasets from other projects mention this project ID in their metadata, for example as a related project or a collaboration. In the generic search these would all appear in the results, which is good for dataset search. On the project page, however, I think we only want to show the datasets of that specific project.
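
The difference can be illustrated with a small in-memory Python sketch. The records, the identifier "OTHER-0001" and both function names are hypothetical; they only mimic the two kinds of matching and are not the actual SOLR schema or query logic.

```python
# Hypothetical, simplified records; not real GBIF data.
DATASETS = [
    {"title": "Dataset A", "project_id": "BID-AF2015-0122-KYLE",
     "description": "Produced by project BID-AF2015-0122-KYLE."},
    {"title": "Dataset B", "project_id": "OTHER-0001",
     "description": "Collaboration with BID-AF2015-0122-KYLE."},
]


def free_text_search(records, term):
    """Mimic ?q=: match the term anywhere in any field value."""
    needle = term.lower()
    return [r["title"] for r in records
            if any(needle in str(value).lower() for value in r.values())]


def project_id_search(records, project_id):
    """Mimic ?project_id=: match the dedicated field only."""
    return [r["title"] for r in records if r["project_id"] == project_id]


# Free text also returns Dataset B, which merely mentions the identifier.
print(free_text_search(DATASETS, "BID-AF2015-0122-KYLE"))
# The dedicated field returns only the dataset that belongs to the project.
print(project_id_search(DATASETS, "BID-AF2015-0122-KYLE"))
```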
    


Author: bko@gbif.org
Created: 2016-10-03 17:01:56.915
Updated: 2016-10-03 17:01:56.915
        
An example on the dev environment:
http://internal.gbif-dev.org/project/programme/bid/project/africa/2015/strengthening-togo-stakeholder-network
http://internal.gbif-dev.org/project/project/2016-guinea-biodiversity-information-network

Please note that the datasets listed there for now are for demo purposes only; they were not produced by the project.

The project IDs of those BID projects can NOW be filled in at drupaledit.gbif.org. Later, when datasets are published via the IPT per the instructions, they should show up on the project page of the new site.
    


Author: bko@gbif.org
Comment: Is this in production (http://api.gbif.org/v1/) yet?
Created: 2016-10-20 12:40:12.34
Updated: 2016-10-20 12:40:12.34


Author: trobertson@gbif.org
Created: 2016-12-28 21:37:02.273
Updated: 2016-12-28 21:37:02.273
        
See http://api.gbif.org/v1/dataset/search?facet=project_id&limit=0

and
  http://api.gbif.org/v1/dataset/search?project_id=AA003-AA003311F
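
For anyone scripting against the facet request above, a minimal Python sketch follows. The URL and parameters come from the link above; the helper names are hypothetical, and the response shape used in `facet_counts` (a `facets` list whose entries carry `field` and `counts`) is an assumption of this sketch, as is the sample payload, which is made up for illustration.

```python
from urllib.parse import urlencode

# Production endpoint as quoted in the links above.
PROD_SEARCH = "http://api.gbif.org/v1/dataset/search"


def facet_url(field="project_id", base=PROD_SEARCH):
    """Build the facet request; limit=0 requests no dataset records."""
    return base + "?" + urlencode({"facet": field, "limit": 0})


def facet_counts(payload, field):
    """Return {name: count} for one facet of an already-decoded response.

    Assumes the shape {"facets": [{"field": ..., "counts": [{"name", "count"}]}]}.
    """
    for facet in payload.get("facets", []):
        if facet.get("field", "").lower() == field.lower():
            return {c["name"]: c["count"] for c in facet.get("counts", [])}
    return {}


# Hypothetical payload in the assumed shape; the count is made up.
sample = {"facets": [{"field": "PROJECT_ID",
                      "counts": [{"name": "AA003-AA003311F", "count": 1}]}]}

print(facet_url())
print(facet_counts(sample, "project_id"))
```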
    


Author: trobertson@gbif.org
Comment: (It will be surfaced in the new site only for user search)
Created: 2016-12-28 21:38:27.733
Updated: 2016-12-28 21:38:27.733


Author: trobertson@gbif.org
Created: 2016-12-29 09:00:41.641
Updated: 2016-12-29 09:00:41.641
        
[~fmendez@gbif.org] added this to the existing portal as well with this commit (it will be visible on the next deploy):
  https://github.com/gbif/portal-web/commit/426aa9e150992fec6d070212cea2a95a11cd9d50