Issue 16688

Duplicate datasets

Reporter: eotuama
Type: Improvement
Summary: Duplicate datasets
Priority: Major
Status: Open
Created: 2014-11-27 10:30:51.909
Updated: 2015-03-02 14:56:37.236
        
Description: It looks like the recent addition of the dataset “Geographically tagged INSDC sequences” [1] has duplicated the contents of “European Molecular Biology Laboratory Australian Mirror” [2] (118,652 records in total).

For example, these two records for Plectropomus maculatus (Bloch, 1790) (coral trout) have the same coordinates and refer to the same GenBank accession (JN222549):

•	http://www.gbif.org/occurrence/1000073651
•	http://www.gbif.org/occurrence/489001485

Each dataset has 10 records of this species.
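
A check for this kind of cross-dataset duplication could be sketched roughly as below. This is only an illustration, not how GBIF indexes or deduplicates records: the field names ("accession", "latitude", "longitude") and both function names are hypothetical, and it assumes each dataset has already been loaded as a list of dicts.

```python
from collections import defaultdict


def duplicate_key(record, precision=5):
    """Build a comparison key from one record.

    Field names are hypothetical, not actual GBIF field names.
    Coordinates are rounded so trivial formatting differences
    (e.g. "146.50" vs 146.5) do not hide a match.
    """
    return (
        record["accession"].strip().upper(),
        round(float(record["latitude"]), precision),
        round(float(record["longitude"]), precision),
    )


def find_cross_dataset_duplicates(dataset_a, dataset_b, precision=5):
    """Return (record_a, record_b) pairs that share accession and coordinates."""
    # Index the first dataset by key so each lookup is a dictionary access.
    index = defaultdict(list)
    for rec in dataset_a:
        index[duplicate_key(rec, precision)].append(rec)

    pairs = []
    for rec in dataset_b:
        for match in index.get(duplicate_key(rec, precision), []):
            pairs.append((match, rec))
    return pairs
```

Run against the two datasets, the pair of occurrences listed above would be expected to come back as one match, since they share an accession and coordinates.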

The recording date also looks suspect (01-Jan-2007 00:00:00): the parent publication states that all fish were sampled between November 2007 and February 2008.

[1] http://www.gbif.org/dataset/ad43e954-dd79-4986-ae34-9ccdbd8bf568

[2] www.gbif.org/dataset/c1fc2df7-223b-4472-8998-70afb3b749ab