Issue 18108

Bad species data in BDJ archive

Reporter: mdoering
Type: Bug
Summary: Bad species data in BDJ archive
Status: Open
Created: 2015-12-22 11:01:25.532
Updated: 2015-12-22 11:21:14.663
        
        
Description: There are several species inside this dataset that have kingdom names. Isn't the data in Pensoft supposed to be reviewed?

http://www.gbif.org/dataset/bd0a2b6d-69d1-4650-8bb1-829c8f92035f
    


Author: rdmpage
Created: 2015-12-22 11:21:14.663
Updated: 2015-12-22 11:21:14.663
        
[~mdoering@gbif.org] I guess the question is "what does it mean to review data?" I've always thought that data can't be adequately reviewed unless people actually use it. I think it would be great if we could do something like this:

0. Post data to GitHub
1. Pass data through a validator that has a corresponding badge on the dataset README flagging any problems (in the same way we have build reports from Travis, etc.)
2. Data is worked on by reviewers (both pre- and post-publication)
3. Data pushed to GBIF, maybe straightaway, but with flags indicating extent of review (e.g., automated tests, how many issues raised, etc.)
4. People can fork data, clean and fix, issue pull requests and get the data reloaded.
5. Original publishers can also issue pull requests if they want to update the data
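
As a rough illustration of step 1, a validator check for the problem reported in this issue might look like the sketch below. It rests on several assumptions that are not part of the proposal above: the data is a tab-delimited file with the Darwin Core columns scientificName and taxonRank, "species that have kingdom names" is read as species-rank rows whose name is a kingdom name, and the kingdom list, the function names, and the badge wording are all hypothetical.

```python
import csv
import io

# Assumption: a short illustrative list; a real check would use a fuller taxonomic backbone.
KINGDOM_NAMES = {
    "animalia", "plantae", "fungi", "bacteria",
    "archaea", "protozoa", "chromista", "viruses",
}


def find_kingdom_named_species(rows):
    """Return (row_number, name) for species-rank rows whose name is a kingdom name."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        rank = (row.get("taxonRank") or "").strip().lower()
        name = (row.get("scientificName") or "").strip()
        if rank == "species" and name.lower() in KINGDOM_NAMES:
            problems.append((row_number, name))
    return problems


def validate_text(text):
    """Parse tab-delimited text with a header row and run the check on it."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return find_kingdom_named_species(reader)


def badge_status(problems):
    """Hypothetical badge text for the dataset README."""
    return "passing" if not problems else f"{len(problems)} issue(s)"


# Made-up sample rows, not taken from the dataset in this issue.
SAMPLE = """scientificName\ttaxonRank
Puma concolor\tspecies
Animalia\tspecies
Plantae\tkingdom
"""

if __name__ == "__main__":
    problems = validate_text(SAMPLE)
    print(badge_status(problems), problems)
```

A check like this could run on every push, with the result feeding both the README badge and the review flags sent to GBIF in step 3.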

I think there's an unfounded notion that "data publishing" means the data has been reviewed. I think this rarely happens. We don't have a real data journal yet.