Issue 16688
Duplicate datasets
Reporter: eotuama
Type: Improvement
Summary: Duplicate datasets
Priority: Major
Status: Open
Created: 2014-11-27 10:30:51.909
Updated: 2015-03-02 14:56:37.236
Description: It looks like the recently added dataset “Geographically tagged INSDC sequences” [1] duplicates the records already in “European Molecular Biology Laboratory Australian Mirror” [2] (118,652 records in total).
For example, these two records for Plectropomus maculatus (Bloch, 1790) (coral trout) have the same coordinates and refer to the same GenBank accession (JN222549):
• http://www.gbif.org/occurrence/1000073651
• http://www.gbif.org/occurrence/489001485
Each dataset has 10 records of this species.
The recording date also looks suspect (01-Jan-2007 00:00:00): the parent publication states that all fish were sampled between November 2007 and February 2008.
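The check described above (same coordinates and same GenBank accession across two datasets) could be sketched roughly as follows. This is only an illustration, not how GBIF detects duplicates: the field names (`dataset_key`, `accession`, `lat`, `lon`), the function name and the sample records are all assumptions. Only the accession JN222549 comes from this report; the ids and coordinates are invented placeholders.

```python
from collections import defaultdict


def find_cross_dataset_duplicates(records, precision=5):
    """Group records by (accession, rounded lat, rounded lon) and keep
    only the groups whose records come from more than one dataset."""
    groups = defaultdict(list)
    for rec in records:
        key = (
            rec["accession"],
            round(rec["lat"], precision),
            round(rec["lon"], precision),
        )
        groups[key].append(rec)
    return {
        key: recs
        for key, recs in groups.items()
        if len({r["dataset_key"] for r in recs}) > 1
    }


# Placeholder records: ids, dataset keys and coordinates are invented.
sample = [
    {"id": "a1", "dataset_key": "dataset-1", "accession": "JN222549", "lat": -18.5, "lon": 146.5},
    {"id": "b1", "dataset_key": "dataset-2", "accession": "JN222549", "lat": -18.5, "lon": 146.5},
    {"id": "a2", "dataset_key": "dataset-1", "accession": "XX000001", "lat": -10.0, "lon": 140.0},
]

duplicates = find_cross_dataset_duplicates(sample)
for key, recs in duplicates.items():
    print(key, [r["id"] for r in recs])
```

Rounding the coordinates before grouping is a design choice so that trivial floating-point differences do not hide a match; an exact comparison may be preferable if both datasets derive from the same source values.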
[1] http://www.gbif.org/dataset/ad43e954-dd79-4986-ae34-9ccdbd8bf568
[2] http://www.gbif.org/dataset/c1fc2df7-223b-4472-8998-70afb3b749ab