Issue 18490

Remove support for dc:rights

Reporter: kbraak
Type: Improvement
Summary: Remove support for dc:rights 
Priority: Unassessed
Status: Open
Created: 2016-05-23 17:22:22.847
Updated: 2016-07-14 16:10:46.752
        
Description: Currently we still show dc:rights on the occurrence detail page if it's populated, e.g. http://www.gbif.org/occurrence/620536225#legal

dc:rights was deprecated in favor of dc:license. See [here|https://github.com/gbif/ipt/blob/master/v2.3-spec.md#deprecated-terms] for a list of all DwC terms that were recently deprecated.

With regard to dc:rights, GBIF.org needs to:

* no longer show dc:rights on occurrence detail page
* show dc:license on occurrence detail page instead
* on new records, make sure we ignore dc:rights during indexing
* for existing records, set dc:rights to NULL (~21M records, 236 datasets)
* update [GBIF Licensing Policy|http://www.gbif.org/terms/licences] to reflect that we are not supporting dc:rights (record-level rights statements) any more
    


Author: ahahn@gbif.org
Created: 2016-05-25 13:23:41.575
Updated: 2016-05-25 13:23:41.575
        
With 21M records that to date do supply a rights statement, we need to consider whether communication with publishers is required. 236 datasets sounds like a lot, but it would be worth checking how many of those supply database defaults or placeholder values instead of meaningful statements. To check this, I propose re-running an occurrence-wide search to summarize:
- publisher UUID
- dataset UUID
- publishing protocol used by the dataset (BioCASe, DiGIR, TAPIR, or IPT)
- distinct rights field content per dataset
- record count for how many records this value applies to (within the dataset)
This table will need to be screened in order to estimate the impact (likely unhappiness factor) and to decide on the best way to communicate the upcoming change, i.e. the deletion of the rights values from GBIF.
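
The summary proposed above could be sketched roughly as follows. This is only an illustration in Python over in-memory records: the field names (`publisher_key`, `dataset_key`, `protocol`, `rights`) and the `summarize_rights` helper are assumptions made for the example, not the actual GBIF occurrence schema or tooling.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class RightsRow(NamedTuple):
    publisher_key: str
    dataset_key: str
    protocol: str
    rights: str
    record_count: int


def summarize_rights(records: Iterable[dict]) -> List[RightsRow]:
    """Count records per (publisher, dataset, protocol, distinct rights value).

    Field names are hypothetical; records with an empty or
    whitespace-only rights value are skipped.
    """
    counts: Counter = Counter()
    for rec in records:
        rights = (rec.get("rights") or "").strip()
        if not rights:
            continue
        key = (rec["publisher_key"], rec["dataset_key"], rec["protocol"], rights)
        counts[key] += 1
    rows = [RightsRow(*key, n) for key, n in counts.items()]
    # Group rows by dataset, most frequent rights value first.
    rows.sort(key=lambda r: (r.dataset_key, -r.record_count))
    return rows


# Made-up sample data, for illustration only.
sample = [
    {"publisher_key": "pub-1", "dataset_key": "ds-1", "protocol": "IPT", "rights": "CC BY 4.0"},
    {"publisher_key": "pub-1", "dataset_key": "ds-1", "protocol": "IPT", "rights": "CC BY 4.0"},
    {"publisher_key": "pub-1", "dataset_key": "ds-1", "protocol": "IPT", "rights": " "},
    {"publisher_key": "pub-2", "dataset_key": "ds-2", "protocol": "DiGIR", "rights": "not available"},
]

for row in summarize_rights(sample):
    print(row)
```

Sorting by dataset and descending count should make placeholder or default values easy to spot, since they tend to apply to every record of a dataset.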
    


Author: ahahn@gbif.org
Comment: I agree we need to communicate / document that dc:rights is no longer supported. I am not sure that the licensing policy is the right place, though. Need to investigate further.
Created: 2016-05-25 13:25:39.784
Updated: 2016-05-25 13:25:39.784


Author: kbraak@gbif.org
Created: 2016-06-02 12:12:31.896
Updated: 2016-06-02 12:13:31.124
        
Hi Andrea, [~cgendreau] just conducted a DwC term frequency analysis. It doesn't answer all the questions you were hoping to answer, but it does reveal the following:

* dc:rights is used in 240,590,396 records. [211,883,652|http://www.gbif.org/occurrence/search?datasetKey=4fa7b334-ce0d-4e88-aaae-2e0c138d049e] of those appear to come from eBird. Excluding eBird, that leaves 28,706,744 records with dc:rights.
* dc:license is used in 26,449,392 records.

Given that dc:rights was deprecated in favor of dc:license on [2014-11-06|http://rs.tdwg.org/dwc/terms/history/decisions/], it's good to see a solid frequency of use for dc:license. We need to continue to promote the use of dc:license over dc:rights. Fortunately, data published using IPT v2.3 or higher will automatically migrate dc:rights to dc:license whenever the IPT admin updates the Occurrence Core to the [latest version|http://rs.gbif.org/core/dwc_occurrence_2015-07-02.xml] and republishes new versions of the data. This hasn't happened with eBird yet, because the IPT hosting it is severely out of date (version 2.1.1, released in April 2014).
    


Author: kbraak@gbif.org
Created: 2016-07-14 16:10:46.752
Updated: 2016-07-14 16:10:46.752
        
[~ahahn@gbif.org]
[~hoefft]
[~bko@gbif.org]

Below I summarize decisions on how to handle record-level rights statements (dc:rights) and record-level license statements (dc:license) on the occurrence detail page. These decisions relate to both this Jira and POR-3113. Web team please take note, as this has consequences for both the current and reengineered occurrence detail pages:

* dc:rights won't be shown on the interpreted occurrence detail page - only on the verbatim occurrence detail page.
* Record-level licenses won't be interpreted for now. During interpretation, dc:license will be populated from the dataset-level license and shown on the interpreted occurrence detail page. The verbatim dc:license will only be shown on the verbatim occurrence page.
* GBIF must update its [Licensing Policy|http://www.gbif.org/terms/licences] to explain that GBIF no longer supports record-level rights statements (dc:rights) or record-level licenses (dc:license). In other words, we need to explain how the dataset-level license gets applied to each record during indexing, and how this will limit usage of the dataset whenever it contains a subset of records having less restrictive licenses.
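
To make the decisions above concrete for the web team, here is a minimal sketch of the interpretation rule in Python. The `interpret_license` function and the plain `rights` / `license` keys are hypothetical and chosen for illustration; this is not the actual GBIF interpretation code.

```python
def interpret_license(verbatim: dict, dataset_license: str) -> dict:
    """Build the interpreted view of one occurrence record.

    Assumed rule, following the decisions above:
    - record-level rights are dropped from the interpreted view
    - the record-level license is not interpreted; the dataset-level
      license is applied instead
    The verbatim record is left untouched so it can still be shown
    on the verbatim page.
    """
    interpreted = {
        key: value
        for key, value in verbatim.items()
        if key not in ("rights", "license")
    }
    interpreted["license"] = dataset_license
    return interpreted


# Made-up example record, for illustration only.
verbatim_record = {
    "scientificName": "Puma concolor",
    "rights": "Copyright Museum X",
    "license": "CC0 1.0",
}
interpreted_record = interpret_license(verbatim_record, "CC BY-NC 4.0")
print(interpreted_record)
```

In this example the record's own CC0 statement is replaced by the dataset's more restrictive license in the interpreted view, which is exactly the usage limitation the Licensing Policy would need to explain.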