Issue 11822

C4 can become unstable with huge intermediate values

Reporter: trobertson
Type: Improvement
Summary: C4 can become unstable with huge intermediate values
Priority: Major
Resolution: Fixed
Status: Closed
Created: 2012-09-10 09:25:15.829
Updated: 2013-12-05 11:15:05.509
Resolved: 2013-12-05 11:06:42.332
        
Description: Running jobs that produce billions of rows from the Mapper can throw heap errors during the reduce-side shuffle and sort.

Error: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1612)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1472)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1321)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1253)

We suspect we should consider lowering io.sort.factor and io.sort.mb.

The mappers are producing 20 billion emissions for reduction.

The density tile backfill produces this.
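
For reference, a minimal sketch (illustrative values, not tuned for this job) of how those two properties could be lowered on the old mapred API, which matches the ReduceTask stack trace above; the class name is a placeholder:

import org.apache.hadoop.mapred.JobConf;

public class DensityTileBackfillConf {

  // Lowers the sort buffers so the shuffle and sort hold less in memory at once.
  public static JobConf lowerSortBuffers(JobConf conf) {
    // io.sort.mb: MB of in-memory buffer used when sorting map output (Hadoop 1.x default: 100).
    // Lowering it trades memory for more spill files.
    conf.setInt("io.sort.mb", 50);
    // io.sort.factor: number of spill segments merged at once (default: 10).
    // A smaller factor reduces the memory held during merge phases.
    conf.setInt("io.sort.factor", 5);
    return conf;
  }
}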
    


Author: trobertson@gbif.org
Created: 2012-09-10 21:31:43.691
Updated: 2012-09-10 21:31:43.691
        
Here is where my knowledge is limited. The mappers (216 of them) are emitting 30B records. There are 240 reducers in total, running in 2 waves, and I am controlling the number of reducers manually. Would the shuffle and sort be less stressful if I had far more reducers?

Furthermore, I see the WritableComparable is using a toString() comparison, which could be improved to reduce object creation in the shuffle and sort.
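
One way to remove the toString() comparison entirely is a byte-level (raw) comparator, so the shuffle and sort never deserialize keys or build Strings. A sketch below, assuming a composite key of two VLong fields; TileKey and its layout are hypothetical stand-ins for whatever key class the backfill actually emits:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.io.WritableUtils;

// Hypothetical composite key: a tile id plus a pixel offset, both stored as VLongs.
public class TileKey implements WritableComparable<TileKey> {

  private long tileId;
  private long pixel;

  public TileKey() {
  }

  public TileKey(long tileId, long pixel) {
    this.tileId = tileId;
    this.pixel = pixel;
  }

  public void write(DataOutput out) throws IOException {
    WritableUtils.writeVLong(out, tileId);
    WritableUtils.writeVLong(out, pixel);
  }

  public void readFields(DataInput in) throws IOException {
    tileId = WritableUtils.readVLong(in);
    pixel = WritableUtils.readVLong(in);
  }

  public int compareTo(TileKey other) {
    if (tileId != other.tileId) {
      return tileId < other.tileId ? -1 : 1;
    }
    return pixel < other.pixel ? -1 : (pixel == other.pixel ? 0 : 1);
  }

  // Raw comparator: orders keys on their serialized bytes, so no objects (and no
  // Strings) are created per comparison during the shuffle and sort.
  public static class Comparator extends WritableComparator {

    public Comparator() {
      super(TileKey.class);
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
      try {
        long t1 = readVLong(b1, s1);
        long t2 = readVLong(b2, s2);
        if (t1 != t2) {
          return t1 < t2 ? -1 : 1;
        }
        // the second VLong starts immediately after the first one
        long p1 = readVLong(b1, s1 + WritableUtils.decodeVIntSize(b1[s1]));
        long p2 = readVLong(b2, s2 + WritableUtils.decodeVIntSize(b2[s2]));
        return p1 < p2 ? -1 : (p1 == p2 ? 0 : 1);
      } catch (IOException e) {
        throw new IllegalArgumentException(e);
      }
    }
  }

  static {
    // register the raw comparator as the default for this key class
    WritableComparator.define(TileKey.class, new Comparator());
  }
}

Registering the comparator via WritableComparator.define means the framework picks it up automatically wherever that key class is sorted.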