17828
Reporter: mdoering
Assignee: mdoering
Type: Feedback
Summary: clb solr indexing throws "temporary file size exceeds temp_file_limit 52GB"
Priority: Critical
Resolution: Fixed
Status: Closed
Created: 2015-09-18 13:57:36.204
Updated: 2015-09-18 16:54:52.336
Resolved: 2015-09-18 16:54:52.277
Description: Exception thrown when rebuilding a clb solr index on uat.
prod/uat postgres is configured with a maximum of 52GB temporary files.
That should really be enough...
{noformat}
INFO [2015-09-18 13:50:51,658+0200] [pool-2-thread-5] org.gbif.checklistbank.index.NameUsageIndexingJob: Adding usages from id 113595569 to 113605568
ERROR [2015-09-18 13:50:53,232+0200] [pool-2-thread-1] org.gbif.checklistbank.index.NameUsageIndexingJob: Error indexing document for usage 100151407
org.apache.ibatis.exceptions.PersistenceException:
### Error querying database. Cause: org.postgresql.util.PSQLException: ERROR: temporary file size exceeds temp_file_limit (52428800kB)
### The error may exist in org/gbif/checklistbank/service/mybatis/mapper/UsageMapper.xml
### The error may involve org.gbif.checklistbank.service.mybatis.mapper.UsageMapper.listParents-Inline
### The error occurred while setting parameters
### SQL: WITH RECURSIVE a AS ( SELECT id, parent_fk FROM name_usage WHERE id = ? UNION ALL SELECT p.id, p.parent_fk FROM a JOIN name_usage p ON a.parent_fk = p.id ) SELECT id FROM a WHERE id!=? LIMIT 100;
### Cause: org.postgresql.util.PSQLException: ERROR: temporary file size exceeds temp_file_limit (52428800kB)
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30) ~[checklistbank-solr.jar:2.24]
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:122) ~[checklistbank-solr.jar:2.24]
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:113) ~[checklistbank-solr.jar:2.24]
at sun.reflect.GeneratedMethodAccessor58.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_45]
at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_45]
at org.apache.ibatis.session.SqlSessionManager$SqlSessionInterceptor.invoke(SqlSessionManager.java:341) ~[checklistbank-solr.jar:2.24]
at com.sun.proxy.$Proxy26.selectList(Unknown Source) ~[na:na]
at org.apache.ibatis.session.SqlSessionManager.selectList(SqlSessionManager.java:190) ~[checklistbank-solr.jar:2.24]
at org.apache.ibatis.binding.MapperMethod.executeForMany(MapperMethod.java:122) ~[checklistbank-solr.jar:2.24]
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:64) ~[checklistbank-solr.jar:2.24]
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:53) ~[checklistbank-solr.jar:2.24]
at com.sun.proxy.$Proxy28.listParents(Unknown Source) ~[na:na]
at org.gbif.checklistbank.service.mybatis.UsageServiceMyBatis.listParents(UsageServiceMyBatis.java:66) ~[checklistbank-solr.jar:2.24]
at org.gbif.checklistbank.index.NameUsageIndexingJob.call(NameUsageIndexingJob.java:117) [checklistbank-solr.jar:2.24]
at org.gbif.checklistbank.index.NameUsageIndexingJob.call(NameUsageIndexingJob.java:28) [checklistbank-solr.jar:2.24]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_45]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
Caused by: org.postgresql.util.PSQLException: ERROR: temporary file size exceeds temp_file_limit (52428800kB)
{noformat}]]>
Author: mdoering@gbif.org
Comment: suggest to increase the work_mem to 256MB which is used for sorting, hashing and subqueries
Created: 2015-09-18 14:54:08.748
Updated: 2015-09-18 14:54:08.748
Author: mdoering@gbif.org
Created: 2015-09-18 14:55:28.975
Updated: 2015-09-18 14:55:28.975
An troublesome example query is this recursive one:
{noformat}
EXPLAIN ANALYZE
WITH RECURSIVE a AS (
SELECT id, parent_fk
FROM name_usage
WHERE id = 100151407
UNION ALL
SELECT p.id, p.parent_fk
FROM a JOIN name_usage p ON a.parent_fk = p.id
)
SELECT id FROM a WHERE id!=100151407 LIMIT 100;
{noformat}
It turns out the parent_fk points to itself, so it appears we end up in some endless loop that is not caught by the limit 100:
{noformat}
checklistbank=> select id,parent_fk from name_usage where id=100151407;
id | parent_fk
-----------+-----------
100151407 | 100151407
{noformat}
Author: mdoering@gbif.org
Created: 2015-09-18 14:59:43.387
Updated: 2015-09-18 14:59:43.387
There are only 2 records like this in clb:
{noformat}
checklistbank=> select dataset_key,id from name_usage where parent_fk=id;
dataset_key | id
--------------------------------------+-----------
aacd816d-662c-49d2-ad1a-97e66e2a2908 | 113689859
52a423d2-0486-4e77-bcee-6350d708d6ff | 100151407
{noformat}
That is from the Brazilian Flora Checklist and Prokaryotic Nomenclature Up-to-date respectively.
Author: mdoering@gbif.org
Comment: Fixing those manually for now in the db by setting the parent_fk to null
Created: 2015-09-18 15:00:09.794
Updated: 2015-09-18 15:00:09.794
Author: mdoering@gbif.org
Comment: Both cases were created by the checklist normalizer when cleaning up synonym relations. This is fixed in code by making sure the accepted and parent are not the same: https://github.com/gbif/checklistbank/commit/84ef264d507b843908c8a6d8e7c6520eb5a3cdc8
Created: 2015-09-18 15:11:04.087
Updated: 2015-09-18 15:50:49.701
Author: mdoering@gbif.org
Comment: as various postgres tune sites recommend 25% of available memory to be used for shared_buffers I suggest to increase that setting too to 5GB
Created: 2015-09-18 15:53:31.386
Updated: 2015-09-18 15:54:40.593