Issue 11432

Partition name usage table across checklists

11432
Reporter: mdoering
Type: Improvement
Summary: Partition name usage table across checklists
Priority: Minor
Resolution: WontFix
Status: Closed
Created: 2012-06-18 10:48:09.656
Updated: 2013-12-09 13:40:54.403
Resolved: 2013-08-06 10:27:57.556
        
Description: Both imports and queries should be improved dramatically by partitioning the name usage table in checklist bank.
Nearly all queries are done within a single checklist, i.e. partition.

Only querying for related usages via a nub_fk is done across all checklists.
See http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html for how to use table inheritance to setup partitioning.]]>
    


Author: mdoering@gbif.org
Created: 2013-08-06 10:27:48.255
Updated: 2013-08-06 10:27:48.255
        
dataset based partitions dramatically speed up indexing (deletions and inserts) of checklists and searches within one checklist, but the blocker that stops partitions from being useful at all is that there is no way in postgres up to 9.3 at least to have a global index covering all partitions. Most queries are actually primary key lookups or get all children where parent_fk=x. In those queries the partition is unknown and postgres needs to do an index scan on every partition - currently ~130 checklists, but CLB should be able to grow to thousands without problems.

It seems better to do a physical index based on checklists and include the dataset_key also on the extension tables so we can easily delete an entire checklist.