Issue 14472

Portal crashed Dec 13, 2013, around 07:00

Reporter: omeyn
Type: Bug
Summary: Portal crashed Dec 13, 2013, around 07:00
Priority: Major
Resolution: CantReproduce
Status: Resolved
Created: 2013-12-13 12:35:47.009
Updated: 2015-03-02 14:36:18.007
Resolved: 2015-03-02 14:36:17.986
        
Description: Timeline:

7:03: Pingdom and Nagios alerts fired.
7:05: Tim got the alerts and restarted Tomcat and Varnish. portal-web still complained about invalid Drupal "stuff", so he restarted Apache as well.
7:xx: The portal looked good except that occurrence search was broken. Tim restarted Jetty for occ-search.
7:22: Nagios and Pingdom reported everything back up.
    
Attachment: portal-web_error.log.zip


Author: omeyn@gbif.org
Comment: The first related error is in the portal-web log at 06:38:02. It shows a socket timeout while trying to answer an empty occurrence search query (which could well have been a Pingdom request).
Created: 2013-12-13 14:17:16.364
Updated: 2013-12-13 14:17:16.364


Author: omeyn@gbif.org
Comment: Cacti reports that /var/local/large on jawa was filling very quickly and was almost full right at the time of the crash. It cleaned itself up, presumably as part of the restarts that Tim did.
Created: 2013-12-13 16:12:24.966
Updated: 2013-12-13 16:12:24.966


Author: trobertson@gbif.org
Comment: Error log from Tomcat attached.
Created: 2013-12-13 16:15:04.212
Updated: 2013-12-13 16:15:04.212