Kite SDK (READ-ONLY) / KITE-1168

kite-dataset fails on Mac OS X due to case-insensitive filesystem while unpacking the JAR


    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.1.0
    • Fix Version/s: None
    • Component/s: Build and Deployment
    • Labels:
    • Environment:
      Mac OS X Sierra 10.12.5


      I already reported this on the GitHub issue tracker as issue #475 but I'm not sure if I was supposed to write it here instead. Feel free to disregard this issue if it will be handled on the GitHub tracker instead.

      The kite-tools-1.1.0-binary.jar fails on Mac OS X because the JAR contains both META-INF/LICENSE and META-INF/license. By default, the HFS+ filesystem is case-preserving but case-insensitive, so it does not allow two names in the same directory that differ only in case.

      You can verify that the JAR contains both a license and a LICENSE entry with the command: jar tvf kite-tools-1.1.0-binary.jar | grep -i license
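      The same check can be scripted. Below is a minimal sketch in Python, using only the standard zipfile module, that lists paths inside a JAR which differ only in case. The function names are my own, and casefold() is only an approximation of how HFS+ compares names, so treat the result as a hint rather than an exact prediction of what the filesystem will reject.

```python
import zipfile
from collections import defaultdict


def path_prefixes(name):
    """Return every directory prefix of an entry name, plus the name itself."""
    parts = [p for p in name.split("/") if p]
    return ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]


def find_case_collisions(entry_names):
    """Group paths that are identical once case is ignored.

    Prefixes are included so that a file (META-INF/LICENSE) is compared
    against a directory that only appears as a parent (META-INF/license/...).
    """
    by_folded = defaultdict(set)
    for name in entry_names:
        for prefix in path_prefixes(name):
            by_folded[prefix.casefold()].add(prefix)
    return sorted(sorted(group) for group in by_folded.values() if len(group) > 1)


def collisions_in_jar(jar_path):
    """Run the collision check against the entries of a JAR file."""
    with zipfile.ZipFile(jar_path) as jar:
        return find_case_collisions(jar.namelist())
```

      Calling collisions_in_jar("kite-tools-1.1.0-binary.jar") should report the LICENSE / license pair if the JAR is laid out as described above.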

      This filename clash renders the tool unusable: when Hadoop tries to unpack the JAR, it throws an IOException: Mkdirs failed to create <tmpdir>.../hadoop-unjar/.../META-INF/license:

      kite-dataset csv-schema movies.csv --record-name Movie
      /Users/ecerulm/bin/kite-dataset debug: Using HADOOP_COMMON_HOME=/Users/ecerulm/.local/stow/hadoop-2.8.1/
      /Users/ecerulm/bin/kite-dataset debug: Using HADOOP_MAPRED_HOME=/Users/ecerulm/.local/stow/hadoop-2.8.1//../hadoop-mapreduce
      /Users/ecerulm/bin/kite-dataset debug: Using HBASE_HOME=/Users/ecerulm/.local/stow/hadoop-2.8.1//../hbase
      /Users/ecerulm/bin/kite-dataset debug: Using HIVE_HOME=/Users/ecerulm/.local/stow/hadoop-2.8.1//../hive
      /Users/ecerulm/bin/kite-dataset debug: Using HIVE_CONF_DIR=/Users/ecerulm/.local/stow/hadoop-2.8.1//../hive/conf
      /Users/ecerulm/bin/kite-dataset debug: Using HADOOP_CLASSPATH=/Users/ecerulm/bin/kite-dataset::
      Exception in thread "main" java.io.IOException: Mkdirs failed to create /var/folders/j5/8yjty44917v3_ydfjyy0gz0c0000gn/T/hadoop-unjar7609709732056315890/META-INF/license
      	at org.apache.hadoop.util.RunJar.ensureDirectory(RunJar.java:140)
      	at org.apache.hadoop.util.RunJar.unJar(RunJar.java:109)
      	at org.apache.hadoop.util.RunJar.unJar(RunJar.java:85)
      	at org.apache.hadoop.util.RunJar.run(RunJar.java:222)
      	at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
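      My understanding of the failure, as a rough sketch: once the file META-INF/LICENSE has been extracted, a case-insensitive filesystem treats META-INF/license as the same name, so the directory cannot be created. The Python below is a simplified in-memory model of that behaviour, not Hadoop's actual code. The class CaseInsensitiveFS and its methods are hypothetical, and it assumes the file is written before the directory is needed.

```python
class CaseInsensitiveFS:
    """Toy model of a case-preserving, case-insensitive filesystem."""

    def __init__(self):
        # Maps the case-folded path to (kind, name as originally written).
        self._entries = {}

    def write_file(self, path):
        self._entries[path.casefold()] = ("file", path)

    def mkdirs(self, path):
        existing = self._entries.get(path.casefold())
        if existing is not None and existing[0] == "file":
            # A file with the "same" name already occupies this slot.
            raise OSError(f"Mkdirs failed to create {path}")
        if existing is None:
            self._entries[path.casefold()] = ("dir", path)


def unpack(fs, file_entries):
    """Extract entries in order, creating each parent directory first."""
    for name in file_entries:
        parent = name.rsplit("/", 1)[0] if "/" in name else ""
        if parent:
            fs.mkdirs(parent)
        fs.write_file(name)
```

      In this model, unpacking META-INF/LICENSE followed by any file under META-INF/license/ raises the error, which mirrors the message in the stack trace above.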

      Is it possible to change the JAR build process to rename the META-INF/license directory to META-INF/licenses? Googling around, I found that the Maven [ApacheLicenseResourceTransformer](https://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ApacheLicenseResourceTransformer) may solve the problem.

      Alternatively, maybe move or rename META-INF/LICENSE (the Jackson JSON processor license) so that it does not conflict with the directory META-INF/license/.
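      Until the build is changed, the rename could in principle be applied to an already built JAR. The following is only a sketch of that idea in Python with the standard zipfile module. The function name and the target name META-INF/LICENSE.jackson.txt are hypothetical choices of mine, and I have not verified that the resulting JAR works with kite-dataset. It also does not preserve timestamps or permissions, and anything prepended to the archive (such as a launcher script) would be lost.

```python
import zipfile


def rename_jar_entry(src_jar, dst_jar, old_name, new_name):
    """Copy src_jar to dst_jar, renaming one entry along the way.

    Entries are copied in their original order, so META-INF/MANIFEST.MF
    stays where it was in the archive.
    """
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            target = new_name if info.filename == old_name else info.filename
            dst.writestr(target, data)


# Example call (paths and the new entry name are assumptions):
# rename_jar_entry(
#     "kite-tools-1.1.0-binary.jar",
#     "kite-tools-1.1.0-binary-fixed.jar",
#     "META-INF/LICENSE",
#     "META-INF/LICENSE.jackson.txt",
# )
```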

      Are any of these alternatives possible? Otherwise, as far as I understand, kite-dataset cannot be used on Mac OS X.




            • Assignee:
              ecerulm Ruben Laguna
            • Votes:
              0
            • Watchers:
              1


              • Created: