Kite SDK (READ-ONLY) / KITE-1042

Issuing Kite CLI commands through Oozie shell actions fails

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: Command-line Interface
    • Labels: None
    • Environment:
      The environment is not especially relevant to this problem, though it has specifically been encountered on RHEL 6.2 running CDH 5.1.4.

      Description

      The Kite CLI launcher script (kite-dataset) starts the CLI with the following call:

      exec ${HADOOP_COMMON_HOME}/bin/hadoop jar "$0" $flags org.kitesdk.cli.Main --dollar-zero "$0" "$@"
      

      However, this call is not sufficient when kite-dataset is run from a shell script that Oozie executes as a shell action. Any "hadoop jar" command issued from a shell action needs to specify the -conf option, like so:

      exec ${HADOOP_COMMON_HOME}/bin/hadoop jar "$0" $flags org.kitesdk.cli.Main -conf ${OOZIE_ACTION_CONF_XML} --dollar-zero "$0" "$@"
      

      Although kite-dataset lets the user supply $flags on the command line for jar-specific options, it does NOT let users place options AFTER the class name org.kitesdk.cli.Main, which is where -conf has to go.

      I think either:

      a) kite-dataset should accept a second user-settable variable whose value is placed AFTER the class name, like so:

         exec ${HADOOP_COMMON_HOME}/bin/hadoop jar "$0" $flags org.kitesdk.cli.Main $config --dollar-zero "$0" "$@"
      

      or

      b) kite-dataset should look for that environment variable automatically and add the option itself, so that Oozie users don't have to think about this. Otherwise everyone running it from Oozie will trip over this.

      or both a) and b).

      I'm thinking something like this:

          OPT_OOZIE_ACTION_XML="-conf ${OOZIE_ACTION_CONF_XML}"
          if [ -z "${OOZIE_ACTION_CONF_XML}" ]; then
              # If the environment variable is not set (i.e. we are not running
              # inside an Oozie action), do not pass -conf at all
              OPT_OOZIE_ACTION_XML=""
          fi
          debug "OPT_OOZIE_ACTION_XML=${OPT_OOZIE_ACTION_XML}"
          ...
          exec ${HADOOP_COMMON_HOME}/bin/hadoop jar "$0" $flags org.kitesdk.cli.Main $config $OPT_OOZIE_ACTION_XML --dollar-zero "$0" "$@"
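
The same idea can be sketched as a small helper function, which keeps the detection logic in one place. This is only an illustration, not the change that was committed: the function name oozie_conf_opts is hypothetical, and it assumes Oozie exports OOZIE_ACTION_CONF_XML into the shell action's environment, as in the -conf example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of kite-dataset): prints "-conf <file>"
# when OOZIE_ACTION_CONF_XML is set and non-empty, and prints nothing otherwise.
oozie_conf_opts() {
    if [ -n "${OOZIE_ACTION_CONF_XML:-}" ]; then
        printf '%s %s' "-conf" "${OOZIE_ACTION_CONF_XML}"
    fi
}

# Sketch of how the launcher line could use it (left as a comment so that
# nothing is executed here):
#   exec ${HADOOP_COMMON_HOME}/bin/hadoop jar "$0" $flags org.kitesdk.cli.Main \
#       $(oozie_conf_opts) --dollar-zero "$0" "$@"
```

Because the result is expanded unquoted, this sketch relies on word splitting and would break if the configuration path contained spaces.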
      

      There is another consideration for Kerberized clusters, which may also need this option set:

      -D mapreduce.job.credentials.binary=${HADOOP_TOKEN_FILE_LOCATION}
      

      But I have this working right now without it.
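
If the Kerberos option does turn out to be needed, it could be added conditionally in the same way. The following is a rough sketch under that assumption: the function name kerberos_opts is hypothetical, and whether the option is required at all is untested here.

```shell
#!/bin/sh
# Hypothetical helper (not part of kite-dataset): prints the credentials
# option only when HADOOP_TOKEN_FILE_LOCATION is set and non-empty.
kerberos_opts() {
    if [ -n "${HADOOP_TOKEN_FILE_LOCATION:-}" ]; then
        printf '%s %s' "-D" "mapreduce.job.credentials.binary=${HADOOP_TOKEN_FILE_LOCATION}"
    fi
}
```

Like -conf, this option would have to be placed after the class name org.kitesdk.cli.Main, and the unquoted expansion assumes the token file path contains no spaces.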

              People

              • Assignee: Mladen Kovacevic (mladkov)
              • Reporter: Mladen Kovacevic (mladkov)