17241
Reporter: mdoering
Type: Bug
Summary: Wikipedia crawl not removed from zookeeper
Description: An attempt to crawl the Wikipedia checklist on Feb 16th left the crawl still "running" in ZooKeeper the next day. CRAM shows it, and the coordinator cleanup thread keeps visiting it (see attached screens).
Priority: Major
Resolution: Fixed
Status: Closed
Created: 2015-02-17 11:05:43.156
Updated: 2015-02-18 18:07:53.163
Resolved: 2015-02-18 18:07:53.134
Author: mdoering@gbif.org Created: 2015-02-17 18:31:51.804 Updated: 2015-02-17 18:31:51.804
The original crawl got shot down during dwca-metasyncing. The registry returned a 500 error when the EML was posted, see POR-2657
-----
INFO [2015-02-16 17:50:01,797+0100] [pool-9-thread-2] org.gbif.crawler.dwca.metasync.DwcaMetasyncService: Updating metadata from DwC-A for dataset [cbb6498e-8927-405a-916b-576d00a6289b]
ERROR [2015-02-16 17:50:02,586+0100] [pool-9-thread-2] org.gbif.crawler.dwca.metasync.DwcaMetasyncService: Exception caught during metasyncing DwC-A [cbb6498e-8927-405a-916b-576d00a6289b]
Problem accessing /dataset/cbb6498e-8927-405a-916b-576d00a6289b/document. Reason:
Author: mdoering@gbif.org Created: 2015-02-18 17:06:39.408 Updated: 2015-02-18 17:06:39.408
Apart from misconfigurations of the UAT CLIs, a ZooKeeper/Curator exception that led to ZooKeeper never being updated by the checklistbank-cli has been fixed: https://github.com/gbif/checklistbank/commit/1901d663c79e19159543d007631c4151f7d6e08b
The same exception might also show up in the crawler or occurrence CLI, as the code was copied from there. All Curator ZK paths need to start with a / and cannot be relative:
ERROR [2015-02-18 12:51:42,683+0100] [main] org.gbif.checklistbank.cli.common.ZookeeperUtils: Exception while deleting ZooKeeper path crawls/cbb6498e-8927-405a-916b-576d00a6289b
java.lang.IllegalArgumentException: Path must start with / character
    at org.apache.curator.utils.PathUtils.validatePath(PathUtils.java:54) ~[checklistbank-cli.jar:2.11]
    at org.apache.curator.utils.PathUtils.validatePath(PathUtils.java:37) ~[checklistbank-cli.jar:2.11]
    at org.apache.curator.utils.ZKPaths.fixForNamespace(ZKPaths.java:63) ~[checklistbank-cli.jar:2.11]
    at org.apache.curator.framework.imps.NamespaceImpl.fixForNamespace(NamespaceImpl.java:82) ~[checklistbank-cli.jar:2.11]
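
To illustrate the leading-slash rule, here is a minimal sketch of a helper that joins path segments into an absolute ZooKeeper path before it is handed to Curator. The class and method names (ZkPathSketch, absolutePath) are hypothetical and are not taken from the linked commit. The sketch uses plain Java only, so it shows the idea rather than the actual fix.

```java
// Hypothetical helper, not the code from the checklistbank commit.
final class ZkPathSketch {
    private ZkPathSketch() {}

    /**
     * Joins the given segments into an absolute path with exactly one
     * leading slash and no empty or duplicated separators.
     */
    static String absolutePath(String... segments) {
        StringBuilder sb = new StringBuilder();
        for (String segment : segments) {
            // A segment may itself contain slashes, e.g. "crawls/<uuid>".
            for (String part : segment.split("/")) {
                if (!part.isEmpty()) {
                    sb.append('/').append(part);
                }
            }
        }
        // With no usable segments, fall back to the root path.
        return sb.length() == 0 ? "/" : sb.toString();
    }

    public static void main(String[] args) {
        // The relative path from the log above becomes absolute:
        // prints /crawls/cbb6498e-8927-405a-916b-576d00a6289b
        System.out.println(absolutePath("crawls", "cbb6498e-8927-405a-916b-576d00a6289b"));
    }
}
```

Building every path through one such helper means a relative path like the one in the log cannot reach the Curator client. The crawler and occurrence CLIs, which share the copied code, could apply the same approach.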