Issue 11468

Species search unusably slow

Reporter: trobertson
Assignee: trobertson
Type: Bug
Summary: Species search unusably slow
Priority: Blocker
Resolution: Fixed
Status: Closed
Created: 2012-06-22 14:33:12.724
Updated: 2013-08-29 14:44:34.625
Resolved: 2012-08-27 14:41:55.169
Description: Searches take 20 secs. Try Animalia on a freshly started tomcat.

Could be a cache warming issue, so suggest considering this in investigations.

/etc/tomcat6/tomcat6.conf has the JAVA_OPTS to tune.

The following can be used to warm the OS level page cache:
 cat / > _*.* > /dev/null
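
For hosts where a script is easier to schedule than a shell one-liner, the same warming idea can be sketched in Python. This is only a sketch: the directory to warm is an assumption (the index path is not given in this issue), and `warm_page_cache` is a hypothetical helper name. It reads every file once and discards the bytes, which pulls the file blocks into the OS page cache.

```python
import os


def warm_page_cache(directory, chunk_size=1024 * 1024):
    """Read every regular file under `directory` once and discard the bytes.

    Reading a file makes the OS load its blocks into the page cache, which is
    the effect of `cat <files> > /dev/null`. Returns the number of bytes read.
    """
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # Skip unreadable files (permissions, files removed mid-walk).
                continue
    return total
```

Calling `warm_page_cache(index_dir)` after a tomcat restart, with `index_dir` pointing at the index directory, would be the intended use. Whether the whole index fits in memory is not known from this issue.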

I believe this drive is partitioned for reasonably large files, but check with Andrei.

Created: 2012-06-26 22:33:11.855
Updated: 2012-06-26 22:33:11.855
A pure search on the webservice without facets takes ~300ms from Copenhagen, measured from outside the GBIF LAN and including about 90ms of latency:

This seems pretty acceptable.
When turning on all 7 facets it still takes exactly the same time for a binomial search at least:

For a monomial search it still takes the same time:

So it appears that the slow search is actually caused by the portal's use of it.
The original guess still is that looking up all facet values, namely titles for datasets, causes the slow search.
Needs to be further investigated.

Created: 2012-07-03 10:24:06.458
Updated: 2012-07-03 10:24:06.458
On a search page like this:

The portal needs to do a lookup for 35 checklist titles and 100 name usage titles, all being separate, uncached web service calls.
When this lookup is commented out, it's much, much faster.


Created: 2012-07-03 10:25:17.359
Updated: 2012-07-03 10:25:17.359
Possible improvements:
- cache for checklists and higher usages
- new api method that takes a list of ids and returns a list of titles only
- load all titles asynchronously
- load the initially hidden "see all" titles only on request asynchronously
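
The first and third ideas above (caching and asynchronous loading) can be combined in a small sketch. Everything named here is hypothetical: `TitleCache` is not part of the portal, and `fetch_title` stands in for whatever single web service call currently resolves one checklist or name usage id to its title. The sketch keeps resolved titles in memory and fetches only the missing ones, concurrently rather than one after another.

```python
from concurrent.futures import ThreadPoolExecutor


class TitleCache:
    """In-memory cache of id -> title with concurrent loading of misses."""

    def __init__(self, fetch_title, max_workers=8):
        # fetch_title is an assumed callable: one web service call per id.
        self._fetch_title = fetch_title
        self._max_workers = max_workers
        self._titles = {}

    def get_titles(self, ids):
        # De-duplicate while keeping order, then keep only uncached ids.
        missing = [i for i in dict.fromkeys(ids) if i not in self._titles]
        if missing:
            with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
                # pool.map yields results in the same order as `missing`.
                fetched = pool.map(self._fetch_title, missing)
                for key, title in zip(missing, fetched):
                    self._titles[key] = title
        return {i: self._titles[i] for i in ids}
```

With this shape, the 35 checklist titles and 100 name usage titles would cost at most one call per distinct id on the first page view and none on later ones. The sketch has no expiry or size limit, so a real cache would need both.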

Created: 2012-07-03 10:34:54.823
Updated: 2012-07-03 10:35:13.146
The comments above are not entirely accurate. Yes, a load of get-by-ID calls can add some time and are great candidates for caching, but the actual search itself is slow for a new term, and that can't be cached:

Repeated searches are immediate, presumably due to internal SOLR level caching, so the improvements outlined above may still be worthwhile, but the search itself is indeed problematic.


Comment: The species search has been addressed, and the get by keys are addressed by using the caching service between the web app and the web services.  
Created: 2012-08-27 14:41:55.2
Updated: 2012-08-27 14:41:55.2