Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 1.1.0
- Fix Version/s: 1.2.0
- Component/s: Command-line Interface
- Labels: None
- Environment: The environment is not especially relevant to this problem, though it has specifically been encountered on RHEL 6.2 running CDH 5.1.4.
Description
The Kite CLI utility (kite-dataset) performs the following call to launch the Kite CLI command:
exec ${HADOOP_COMMON_HOME}/bin/hadoop jar "$0" $flags org.kitesdk.cli.Main --dollar-zero "$0" "$@"
However, when kite-dataset is executed from within a shell script that is run as an Oozie shell action, every call to the "hadoop jar" command needs to specify the -conf option, like so:
exec ${HADOOP_COMMON_HOME}/bin/hadoop jar "$0" $flags org.kitesdk.cli.Main -conf ${OOZIE_ACTION_CONF_XML} --dollar-zero "$0" "$@"
Although kite-dataset is flexible enough to let the user specify $flags on the command line for jar-specific options, it does NOT give users the ability to place variables AFTER the class org.kitesdk.cli.Main, which is where these options must go.
I think either:
a) kite-dataset should accept a second variable that users can specify, placed AFTER the class name, like so:
exec ${HADOOP_COMMON_HOME}/bin/hadoop jar "$0" $flags org.kitesdk.cli.Main $config --dollar-zero "$0" "$@"
or..
b) kite-dataset should actually contain logic to LOOK for that environment variable automatically, and then place it there for users, so that Oozie users don't have to think about this. Everyone will get tripped up by this.
OR...
Both a) and b)
I'm thinking something like this:
OPT_OOZIE_ACTION_XML="-conf ${OOZIE_ACTION_XML_FILE}"
if [ -z "${OOZIE_ACTION_XML_FILE}" ]; then
  # If the environment variable is not set, then neither do we set this
  # value as a config variable
  OPT_OOZIE_ACTION_XML=""
fi
debug "OPT_OOZIE_ACTION_XML=${OPT_OOZIE_ACTION_XML}"
...
exec ${HADOOP_COMMON_HOME}/bin/hadoop jar "$0" $flags org.kitesdk.cli.Main $config $OPT_OOZIE_ACTION_XML --dollar-zero "$0" "$@"
There's another consideration as well for Kerberized clusters, which may need this variable set:
-D mapreduce.job.credentials.binary=${HADOOP_TOKEN_FILE_LOCATION}
But I'm currently getting this to work without it.
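By analogy with the sketch above, the Kerberos flag could also be added only when the launcher has exported the token location. This is just a sketch under my assumptions: the OPT_HADOOP_TOKEN variable name is mine, and I'm assuming HADOOP_TOKEN_FILE_LOCATION is set in the action's environment on secure clusters (it uses echo where kite-dataset's debug helper would go):

```shell
# Build the credentials option only when the launcher has exported the
# delegation-token file location (typically set on Kerberized clusters).
OPT_HADOOP_TOKEN=""
if [ -n "${HADOOP_TOKEN_FILE_LOCATION}" ]; then
  OPT_HADOOP_TOKEN="-D mapreduce.job.credentials.binary=${HADOOP_TOKEN_FILE_LOCATION}"
fi
echo "OPT_HADOOP_TOKEN=${OPT_HADOOP_TOKEN}"
```

The option string would then be appended after org.kitesdk.cli.Main in the exec line, exactly like $OPT_OOZIE_ACTION_XML above, so unsecured clusters see no change in behavior.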