Issue 18162

Offline Dwc-A datasets

Reporter: mdoering
Assignee: jlegind
Type: Bug
Summary: Offline Dwc-A datasets
Priority: Critical
Status: Open
Created: 2016-01-18 16:46:13.618
Updated: 2016-01-18 16:49:07.991
        
Description: While crawling all checklists I found 25 datasets that have been entirely offline for at least 3 days. As these are single DwC-A zip files, it is very likely that we have badly registered datasets.

-----

Downloading http://borneanlandsnails.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/6a9ae570-99f2-4bda-9312-979c449e45b5.dwca failed!: 429
Downloading http://parmotrema.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/306af523-20d5-48b4-a0f0-c67682cbec0b.dwca failed!: 429
Downloading http://batrach.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/ce8956d0-fdad-40fe-b591-956a7633771d.dwca failed!: 429
Downloading http://alpheidae.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/1fa870c2-e36b-4046-b70c-081560c95bf5.dwca failed!: 429
Downloading http://agrilus.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/0ddf40b0-067c-4114-8d3e-24782468177f.dwca failed!: 429
Downloading http://arachnids.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/b040c606-644f-4f6b-8f43-124c2a00d345.dwca failed!: 429
Downloading http://plazi.cs.umb.edu/GgServer/dwca/8F6095152972FFDB812A943AFF9EFFE7.zip to /home/crap/storage/dwca/c4b5ac92-bf73-4f08-8377-b40b22923281.dwca failed!: 404
Downloading http://eolhotlist.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/9d015991-438a-4093-a8d2-9a8006492121.dwca failed!: 429
Downloading http://annelida.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/3c37ebc6-d3d8-496c-96e8-55a6088aaecb.dwca failed!: 429
Downloading http://apoidea.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/0e35c8a8-d77a-4de5-92e9-e87166ed028a.dwca failed!: 429
Downloading http://cnidaria.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/a5358a8d-9b1a-43a3-b769-50d3e6ef8bd9.dwca failed!: 429
Downloading http://avesamericanas.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/3e9a9493-47e4-4dc9-a73a-00c23156b100.dwca failed!: 429
Downloading http://neotropicalfishes.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/9fe7c02c-a45e-4f31-afca-4840ead2d62a.dwca failed!: 429
Downloading http://echinoderms.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/2599d15f-58e1-4766-bbaf-93344bd09bb6.dwca failed!: 429
Downloading http://plazi.cs.umb.edu/GgServer/dwca/CD172025FFE5FFD3FFA5230590548268.zip to /home/crap/storage/dwca/f22c0674-1d59-40c0-b0b8-168ae8e778de.dwca failed!: 404
Downloading http://worms.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/ece71924-278a-4e34-910e-a5caa52940ef.dwca failed!: 429
Downloading http://anolislizards.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/094ef86a-bd6e-4a3e-8dd1-f35c8c69b8a8.dwca failed!: 429
Downloading http://britishbryozoans.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/02c23566-1d5b-4f0c-9d2f-07f3ae24381b.dwca failed!: 429
Downloading http://terrslugs.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/0eb347bd-930c-4088-8eb3-1ace20c3e7ec.dwca failed!: 429
Downloading http://sacoglossa.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/7dbb1785-2e54-46f0-a823-7673b06c729b.dwca failed!: 429
Downloading http://florakorea.myspecies.info/en/gbif-dwca.zip to /home/crap/storage/dwca/5de20c3a-7d23-442a-a34e-36ae24d0f8e3.dwca failed!: 429
Downloading http://syrphidae.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/c25efcf4-fe66-4530-b02e-8c218c09e690.dwca failed!: 429
Downloading http://dev.gbif.no/ipt/archive.do?r=artsnavn to /home/crap/storage/dwca/a6c6cead-b5ce-4a4e-8cf5-1542ba708dec.dwca failed!: 503
Downloading http://www.catalogueoflife.org/DCA_Export/zip/archive-complete.zip to /home/crap/storage/dwca/7ddf754f-d193-4cc9-b351-99906754a03b.dwca failed!: 404
Downloading http://scarabaeinae.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/c1c3efdd-0628-4939-9101-8ae7c843b39a.dwca failed!: 429


Author: mdoering@gbif.org
Created: 2016-01-18 16:48:13.523
Updated: 2016-01-18 16:49:07.988
        
21 of the 25 respond with 429, which is "Too Many Requests". It looks like we are overloading the Scratchpads...

That leaves just 4 that fail for other reasons:
-----
Downloading http://www.catalogueoflife.org/DCA_Export/zip/archive-complete.zip to /home/crap/storage/dwca/7ddf754f-d193-4cc9-b351-99906754a03b.dwca failed!: 404
Downloading http://plazi.cs.umb.edu/GgServer/dwca/8F6095152972FFDB812A943AFF9EFFE7.zip to /home/crap/storage/dwca/c4b5ac92-bf73-4f08-8377-b40b22923281.dwca failed!: 404
Downloading http://plazi.cs.umb.edu/GgServer/dwca/CD172025FFE5FFD3FFA5230590548268.zip to /home/crap/storage/dwca/f22c0674-1d59-40c0-b0b8-168ae8e778de.dwca failed!: 404
Downloading http://dev.gbif.no/ipt/archive.do?r=artsnavn to /home/crap/storage/dwca/a6c6cead-b5ce-4a4e-8cf5-1542ba708dec.dwca failed!: 503
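The triage above (set aside the 429 responses, then look at what remains) could be scripted roughly as follows. This is only a sketch: it assumes the crawler's failure lines always have the "Downloading <url> to <path> failed!: <status>" shape seen in this issue, and the names `group_failures` and `split_throttled` are hypothetical helpers, not part of the crawler.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the log lines quoted in this issue.
LINE_RE = re.compile(r"^Downloading (\S+) to (\S+) failed!: (\d{3})$")


def group_failures(log_text):
    """Group failed download URLs by HTTP status code."""
    by_status = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # silently skip anything that is not a failure line
            url, _path, status = match.groups()
            by_status[int(status)].append(url)
    return dict(by_status)


def split_throttled(by_status):
    """Separate 429 (rate-limited) URLs from all other failures."""
    throttled = by_status.get(429, [])
    other = {code: urls for code, urls in by_status.items() if code != 429}
    return throttled, other


# Three lines copied from the log in this issue.
sample = """\
Downloading http://worms.myspecies.info/gbif-dwca.zip to /home/crap/storage/dwca/ece71924-278a-4e34-910e-a5caa52940ef.dwca failed!: 429
Downloading http://dev.gbif.no/ipt/archive.do?r=artsnavn to /home/crap/storage/dwca/a6c6cead-b5ce-4a4e-8cf5-1542ba708dec.dwca failed!: 503
Downloading http://www.catalogueoflife.org/DCA_Export/zip/archive-complete.zip to /home/crap/storage/dwca/7ddf754f-d193-4cc9-b351-99906754a03b.dwca failed!: 404
"""

throttled, other = split_throttled(group_failures(sample))
print("throttled (429):", throttled)
for code, urls in sorted(other.items()):
    print(code, urls)
```

Only the non-429 group is a candidate for a bad registration; the throttled URLs would need a slower re-crawl before anything can be concluded about them.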