The Kite CLI utility (kite-dataset) makes the following call to launch the Kite CLI:
However, when kite-dataset is run from a shell script that is itself executed as an Oozie shell action, there is a problem: every shell action that calls the "hadoop jar" command needs to pass the -conf option, like so:
Although kite-dataset lets the user pass JAR-specific options on the command line through $flags, it does NOT let users place arguments AFTER the class org.kitesdk.cli.Main, which is where these options must go.
I think one of the following:
a) kite-dataset should accept a second variable, set by the user, whose contents are placed AFTER the class name, like so:
b) kite-dataset should contain logic to LOOK for that environment variable automatically and insert it for users, so that Oozie users don't have to think about this. Everyone will get tripped up by this.
c) Both a) and b)
I'm thinking something like this:
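A rough sketch of how a) and b) could be combined. It assumes the environment variable in question is OOZIE_ACTION_CONF_XML, which Oozie sets for shell actions. KITE_POST_CLASS_FLAGS and build_post_class_args are made-up names for illustration and are not part of the current kite-dataset script.

```shell
# Sketch only: the variable and function names are assumptions,
# not taken from the real kite-dataset script.

# Prints, one per line, the arguments that would be placed AFTER
# org.kitesdk.cli.Main on the "hadoop jar" command line.
build_post_class_args() {
  # Option b): pick up the Oozie action configuration automatically
  # when the (assumed) OOZIE_ACTION_CONF_XML variable is set.
  if [ -n "${OOZIE_ACTION_CONF_XML:-}" ]; then
    printf '%s\n' "-conf" "$OOZIE_ACTION_CONF_XML"
  fi
  # Option a): user-supplied post-class arguments (hypothetical variable).
  # Left unquoted on purpose so the value splits into words, like $flags.
  if [ -n "${KITE_POST_CLASS_FLAGS:-}" ]; then
    printf '%s\n' $KITE_POST_CLASS_FLAGS
  fi
}
```

The launcher would then splice that output between org.kitesdk.cli.Main and the user's own arguments. Outside Oozie, with neither variable set, the function prints nothing and the command line stays unchanged.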
There's another consideration for Kerberized clusters, which may need this variable set:
But I'm getting this to work right now without it.
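For the Kerberos case, a setting often needed in Oozie shell actions is to forward the delegation token file named by HADOOP_TOKEN_FILE_LOCATION through the mapreduce.job.credentials.binary property. That this is the variable meant here is an assumption, and build_credentials_arg is a made-up name. A minimal sketch:

```shell
# Sketch only: assumes the Kerberos-related variable is
# HADOOP_TOKEN_FILE_LOCATION; the function name is hypothetical.

# Prints a -D option pointing Hadoop at the delegation token file,
# or nothing when no token file location is set.
build_credentials_arg() {
  if [ -n "${HADOOP_TOKEN_FILE_LOCATION:-}" ]; then
    printf '%s\n' "-Dmapreduce.job.credentials.binary=$HADOOP_TOKEN_FILE_LOCATION"
  fi
}
```

Like -conf, this is a generic Hadoop option, so it would also need to go after the class name.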