Issue 14383

Demonstrate mobilization and discovery of sample-based data

14383
Reporter: ahahn
Assignee: kbraak
Type: Epic
Summary: Demonstrate mobilization and discovery of sample-based data
Priority: Major
Status: InProgress
Created: 2013-11-19 14:33:50.892
Updated: 2016-04-29 19:59:19.934
DueDate: 2014-12-31 00:00:00.0
        
Description: Identify suitable sample-based datasets, map them together. Check whether DwC-A is the best option, and potentially compare against other standards that may be better suited, involving the appropriate communities and avoiding alienation.

Explore options for further processing and visualisation - which outputs are needed on the portal side?

Expected outcomes: prototype implementation using real sample-based data; roadmap for the further process

*Demonstration of publishing of sample-based data through the GBIF network: milestone Dec 2014*

*Rationale*
Data users in many fields need to be able to understand whether data records have been collected as part of larger projects with consistent sampling methodologies, and to use this criterion to discover and download data.

Recent workshops have considered options to extend the Darwin Core definition to handle this situation. During 2014, GBIF will work with interested data publishers holding suitable data to trial the use of Darwin Core extensions for sample-based data and to demonstrate access to these elements through the GBIF portal and services.

*Required components*
- communication on planned procedure
- set of use cases (e.g. biomass assessment based on a vegetation plot)
- decision on the standard to be used / extended to
-- recognize occurrence records from a single sampling event
-- understand which sampling events are comparable across time and space
-- understand relative abundance of taxa within samples
- IPT to be adapted to publish sample-based data
- identification of and decision on a pilot set of sample datasets (communications involved)
- pilot data mobilization activity with existing publishers
- evaluation of options for using and presenting these data through search interfaces, maps etc
- roadmap for further process to
-- index
-- filter
-- download
-- visualize
- communication via news items / through GBits as appropriate
- no wider consultation of GBIF Nodes / publishers, consultation is focussed on external partners / consumers of data
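
Two of the capabilities listed above (recognizing occurrence records from a single sampling event, and understanding relative abundance of taxa within samples) can be illustrated with a minimal sketch. It assumes occurrence records that carry the Darwin Core terms eventID, scientificName and individualCount; the record values and the function name relative_abundance are hypothetical and are not taken from the attached documents.

```python
from collections import defaultdict

# Hypothetical occurrence records, invented for illustration only.
# eventID, scientificName and individualCount are Darwin Core terms.
records = [
    {"eventID": "plot-1", "scientificName": "Quercus robur", "individualCount": 6},
    {"eventID": "plot-1", "scientificName": "Fagus sylvatica", "individualCount": 2},
    {"eventID": "plot-2", "scientificName": "Quercus robur", "individualCount": 5},
]


def relative_abundance(records):
    """Return {eventID: {scientificName: share of that event's total count}}."""
    # Group counts by sampling event first, then by taxon within the event.
    counts = defaultdict(lambda: defaultdict(int))
    for rec in records:
        counts[rec["eventID"]][rec["scientificName"]] += rec["individualCount"]

    result = {}
    for event_id, taxa in counts.items():
        total = sum(taxa.values())
        if total == 0:
            # No individuals counted for this event: shares are undefined.
            continue
        result[event_id] = {name: n / total for name, n in taxa.items()}
    return result


shares = relative_abundance(records)
```

Grouping by eventID is only one possible way to recognize records from the same sampling event; whether events are comparable across time and space would additionally depend on a shared sampling protocol, which this sketch does not model.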

*Dependencies*
- Required at the start of SEP2D (implementation in 2015, though possibly earlier); check project definition for requirements and dependencies to specify tasks; committed: training
- EMODNet task re. mobilizing sample based data (demonstrate mobilizing of data). Check work group report and nail down the definition of what to do (2014 task)
- EU BON: align activities, as the GBIF and EU BON work programmes share some common requirements and outcomes regarding sample-based data (EU BON milestone report: section on GBIF IPT and sample-based data; EU BON training workshop early April 2014: demo a test-IPT supporting the "sample" core type, using the intermediate state of terms/structure/vocabulary; agreement on "sample" core for DwC archives; audience providing sample datasets and acting as a "sounding board" for decisions)
- BioVel (consumer of sample data)

*Involvement*
OB, EOT, IT group

*Tracking sheet*
http://livelink.gbif.org/gbif/livelink/overview/4680508
    
Attachment Darwin Core for sample data_v3.docx
Attachment Darwin Core for sample data_v4.docx
Attachment IPT-sample-data-primer.docx
Attachment Sample-data-demonstration-tasks-timeline.docx
Attachment sample-data-DwC-fields_v01.docx
Attachment Sample-data-task-progress-v2.docx
Attachment Sample-data-task-progress-v4.docx
Attachment Sample-data-task-progress-v5.docx
Attachment Sample-data-task-progress-v6.docx


Author: eotuama@gbif.org
Comment: Revised document expands the "standards and tools" section on pages 5-6.
Created: 2014-02-24 12:35:05.467
Updated: 2014-02-24 12:35:05.467


Author: eotuama@gbif.org
Comment: A first draft identifying tasks in more detail and developing a timeline for completion. This is being shared with Hannu Saarenmaa, as leader of EU BON work package 2, with the aim of engaging with EU BON partners/expertise as much as possible.
Created: 2014-02-26 12:35:24.095
Updated: 2014-02-26 12:35:24.095


Author: mdoering@gbif.org
Created: 2014-03-13 11:50:47.294
Updated: 2014-03-13 11:50:47.294
        
[~eotuama@gbif.org] do we have an issue for creating a sample core yet? I would like to capture some discussion if that is possible. Thanks for the attached word docs. I've got a question about this paragraph:

----
Neither the Occurrence class nor the newly ratified MaterialSample class is appropriate, as the former refers to individuals (observations, specimens) while the latter is “The category of information pertaining to the physical results of a sampling (or subsampling) event. In biological collections, the material sample is typically collected, and either preserved or destructively processed”. Thus, samples that do not remove anything from the environment cannot be placed in the MaterialSample class. The BioCollections Ontology at the BioPortal is working on some new BCO terms to describe surveys and plot sampling, but these are not yet available and it remains to be worked out how they could be used with DwC.

For the purposes of publishing sample data using the GBIF IPT, it will be necessary to define the equivalent of a new “sample” core, additional to the existing occurrence and taxon cores.
----

I think the Occurrence core could indeed be extended with the new terms to capture sample data as defined here. An occurrence does not need to be an individual at all. We would like it to be taxonomically homogeneous, but we can surely add a quantity term, just as dwc:individualCount already exists.
    


Author: eotuama@gbif.org
Created: 2014-03-13 14:12:35.558
Updated: 2014-03-13 14:12:35.558
        
As you say, an occurrence need not be an individual, and that is how we treat it in the sample_occurrence.xml [1] extension. I'm sure we can model the split between core and extension in many ways. This just seemed cleaner, but I would welcome arguments to the contrary.

[1] http://rs.gbif.org/sandbox/extension/sample_occurrence.xml
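
As a rough illustration of the core-plus-extension arrangement discussed in this comment, the sketch below joins rows of a hypothetical "sample" core file to rows of an occurrence extension file. The column names id and coreid, the file contents and the function join_star are assumptions made for the example; the actual terms of the sandbox extension are defined in the XML file linked above and are not reproduced here.

```python
import csv
import io

# Hypothetical file contents, invented for illustration only.
# Each extension row points at one core row through its coreid column.
core_csv = "id,samplingProtocol\nplot-1,vegetation plot\n"
ext_csv = (
    "coreid,scientificName,individualCount\n"
    "plot-1,Quercus robur,6\n"
    "plot-1,Fagus sylvatica,2\n"
)


def join_star(core_text, ext_text):
    """Attach extension rows to the core record they reference."""
    # Index core rows by id; each one gets its own empty occurrence list.
    core = {
        row["id"]: dict(row, occurrences=[])
        for row in csv.DictReader(io.StringIO(core_text))
    }
    for row in csv.DictReader(io.StringIO(ext_text)):
        parent = core.get(row["coreid"])
        if parent is not None:  # ignore extension rows without a core record
            parent["occurrences"].append(row)
    return core


samples = join_star(core_csv, ext_csv)
```

The alternative raised in the previous comment, an extended Occurrence core, would instead keep one row per occurrence and repeat the sampling-event fields on each row; the sketch does not argue for either layout.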
    


Author: eotuama@gbif.org
Comment: The document (sample-data-DwC-fields_v01.docx - see link above) provides a revised version of the data model for sample data together with some worked examples. 
Created: 2014-05-21 16:55:28.081
Updated: 2014-05-21 16:57:15.685


Author: eotuama@gbif.org
Comment: This document describes how the Darwin Core vocabulary, extended with a small number of additional terms, can be used in a Darwin Core Archive to express sample-based data sets.
Created: 2014-10-10 13:27:44.062
Updated: 2014-10-10 13:27:44.062


Author: eotuama@gbif.org
Comment: This document lists tasks achieved for Jira GBIF-13 as of October and outlines next steps and major dependencies.
Created: 2014-10-13 15:14:31.689
Updated: 2014-10-13 15:14:31.689


Author: eotuama@gbif.org
Comment: The document "Sample-data-task-progress-v4.docx" provides an update on tasks completed and future work plans into 2015 including details of a phased campaign to get the GBIF nodes and wider community involved in mobilising and publishing sample based data.
Created: 2014-12-11 13:54:17.935
Updated: 2014-12-11 13:54:17.935


Author: eotuama@gbif.org
Comment: The document "Sample-data-task-progress-v5.docx" provides an update on tasks completed and future work plans into 2015 including details of a phased campaign to get the GBIF nodes and wider community involved in mobilising and publishing sample based data.
Created: 2015-02-05 10:50:35.119
Updated: 2015-02-05 10:50:35.119


Author: eotuama@gbif.org
Comment: The document "Sample-data-task-progress-v6.docx" provides an update on tasks completed and future work plans into 2015 including details of a phased campaign to get the GBIF nodes and wider community involved in mobilising and publishing sample based data. In this version, the table listing documents required for sample data and their related tasks and status has been revised. 
Created: 2015-02-06 16:39:31.432
Updated: 2015-02-06 16:39:31.432