Description
Referring to this page:
https://docs.cloudera.com/display/DOC/Hadoop+Upgrade+from+CDH2+or+CDH3b2+to+CDH3b3
The new hdfs and mapred users are mentioned towards the middle of the doc, but it would be extremely beneficial to start the document with a conceptual overview of what will happen during the install, and how the world differs before and after the Kerberos security patch. Something along the lines of:
1. Security means running HDFS and MapReduce as separate users (hdfs and mapred).
2. The old hadoop user is renamed to hdfs. It runs the NN/SNN/DN and also acts as the HDFS superuser.
3. A new mapred user runs the JT/TT.
4. There is a hadoop group to which both mapred and hdfs belong.
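
For illustration, the user and group layout described above could be checked on a node with a small script like the one below. This is only a sketch: the helper name `user_in_group` is made up for this example, and it assumes a POSIX shell with the standard `id` utility and group names that contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named user exists and belongs to the named group.
user_in_group() {
    user=$1
    group=$2
    # id -nG prints the names of all groups the user belongs to, separated by spaces
    groups=$(id -nG "$user" 2>/dev/null) || return 1
    for g in $groups; do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

# Report whether the post-upgrade users are members of the hadoop group.
for u in hdfs mapred; do
    if user_in_group "$u" hadoop; then
        echo "ok: $u is in group hadoop"
    else
        echo "MISSING: $u is not in group hadoop (or the user does not exist)"
    fi
done
```

A MISSING line on a freshly upgraded node would suggest the package scripts did not create or rename the accounts as expected.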
It is also worth mentioning some things to watch out for:
1. Directory ownership and permissions may need to change. (I don't have a full inventory of these, but often 777 or 775 becomes 755 on mapred.local.dir, dfs.data.dir, etc.)
2. You may have to 'hadoop fs -chown' (and/or -chgrp, -chmod) a number of things in HDFS. For example, mapred.system.dir may, after the upgrade, still be owned by hadoop:supergroup when it should be owned by mapred:hadoop.
3. I'm sure there are a few other corner cases like these.
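
As a rough sketch of what the first two items might look like in practice, the script below prints the ownership and permission commands rather than running them. Everything specific here is an assumption for illustration: the three paths are hypothetical defaults (take the real values from your own dfs.data.dir, mapred.local.dir and mapred.system.dir settings), the `run` helper and `DRY_RUN` variable are made up for this example, the local directory owners are inferred from which user runs each daemon, and the use of `sudo -u hdfs` assumes the HDFS changes are made as the hdfs superuser. The 755 mode is the one mentioned above and should be confirmed against the upgrade guide.

```shell
#!/bin/sh
# Hypothetical paths: replace with the values from your own configuration.
DFS_DATA_DIR=${DFS_DATA_DIR:-/data/1/dfs/dn}
MAPRED_LOCAL_DIR=${MAPRED_LOCAL_DIR:-/data/1/mapred/local}
MAPRED_SYSTEM_DIR=${MAPRED_SYSTEM_DIR:-/mapred/system}

# Print each command instead of executing it, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

fix_ownership() {
    # Local filesystem: DataNode storage is assumed to belong to the hdfs user
    run chown -R hdfs:hadoop "$DFS_DATA_DIR"
    run chmod 755 "$DFS_DATA_DIR"
    # Local filesystem: TaskTracker scratch space is assumed to belong to mapred
    run chown -R mapred:hadoop "$MAPRED_LOCAL_DIR"
    run chmod 755 "$MAPRED_LOCAL_DIR"
    # HDFS: mapred.system.dir should end up owned by mapred:hadoop
    run sudo -u hdfs hadoop fs -chown -R mapred:hadoop "$MAPRED_SYSTEM_DIR"
}

fix_ownership
```

Printing the commands first makes it easy to review them against the actual directory inventory before anything is changed. Setting DRY_RUN=0 would execute them.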