Description
While following the tuning advice at http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_yarn_tuning.html, and in particular the tuning spreadsheet linked from that page, we came across the properties mapreduce.map.java.opts.max.heap and mapreduce.reduce.java.opts.max.heap.
We set those properties to the values recommended by the spreadsheet and configured our cluster through the files in /etc/hadoop/conf (mapred-site.xml, yarn-site.xml, etc.).
After running into memory issues (GC overhead limit exceeded, etc.), we checked the actual command lines of the JVMs running our MRv2 tasks and found that the properties have no effect: no -Xmx... option appears on the command lines. Putting the equivalent -Xmx... setting directly into mapreduce.map.java.opts and mapreduce.reduce.java.opts solved the problem.
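For anyone who wants to check their own configuration for the same problem, here is a minimal sketch that parses the text of a mapred-site.xml and reports whether the java.opts properties carry an explicit -Xmx flag. The function name find_heap_flags and the sample values are hypothetical and chosen only for illustration; they are not the values from our cluster.

```python
import re
import xml.etree.ElementTree as ET

# Matches an explicit JVM max-heap flag such as -Xmx1024m or -Xmx2g.
XMX_PATTERN = re.compile(r"-Xmx\d+[kKmMgG]?")


def find_heap_flags(site_xml_text):
    """Return {property name: -Xmx flag or None} for the map/reduce java.opts."""
    root = ET.fromstring(site_xml_text)
    # Collect all <property> entries as name -> value.
    values = {p.findtext("name"): (p.findtext("value") or "")
              for p in root.iter("property")}
    result = {}
    for name in ("mapreduce.map.java.opts", "mapreduce.reduce.java.opts"):
        match = XMX_PATTERN.search(values.get(name, ""))
        result[name] = match.group(0) if match else None
    return result


# Hypothetical file content: the map side sets only the ...max.heap property,
# the reduce side sets -Xmx directly in java.opts.
SAMPLE = """<configuration>
  <property>
    <name>mapreduce.map.java.opts.max.heap</name>
    <value>1717986918</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx1638m</value>
  </property>
</configuration>"""

print(find_heap_flags(SAMPLE))
```

A result of None for a property means that no -Xmx flag is set through that property, which matches what we observed on the task JVM command lines when only the ...max.heap properties were configured.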
The ...max.heap properties appear nowhere in the Cloudera distribution except in some JavaScript code in Cloudera Manager. I therefore assume that the Cloudera Manager frontend transforms mapreduce.map.java.opts.max.heap=... into mapreduce.map.java.opts=-Xmx... (and likewise for ...reduce...).
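To make the assumed transformation concrete, here is a rough sketch of what such a rewrite could look like. This is not Cloudera Manager code. The function name to_java_opts is hypothetical, and the sketch assumes that the ...max.heap value is given in bytes and is converted to whole megabytes; neither is confirmed by anything we found.

```python
def to_java_opts(props):
    """Rewrite '<x>.java.opts.max.heap' entries into an -Xmx flag on '<x>.java.opts'."""
    suffix = ".max.heap"
    # Keep every property that is not a ...max.heap property unchanged.
    out = {k: v for k, v in props.items() if not k.endswith(suffix)}
    for key, value in props.items():
        if key.endswith(suffix):
            target = key[: -len(suffix)]
            heap_mb = int(value) // (1024 * 1024)  # assumption: value is in bytes
            existing = out.get(target, "")
            # Append the flag to any options already present on the target.
            out[target] = (existing + " " + f"-Xmx{heap_mb}m").strip()
    return out


print(to_java_opts({"mapreduce.map.java.opts.max.heap": "1073741824"}))
```

If something like this happens only inside Cloudera Manager, a cluster configured by hand through /etc/hadoop/conf never gets the -Xmx flag, which would explain the behaviour described above.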
If so, the documentation should state clearly which properties are valid configuration properties of the Hadoop framework itself and which exist only in Cloudera Manager.