Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Not A Bug
    • Affects Version/s: 3.9.0
    • Fix Version/s: None
    • Component/s: con.oozie, con.spark
    • Environment:

      10-node cluster running
      Cloudera Enterprise 5.5.1 (#8 built by jenkins on 20151201-1822 git: 2a7dfe22d921bef89c7ee3c2981cb4c1dc43de7b),
      each node with 16 GB of memory.

      Description

      A Spark app workflow created in Hue cannot save to HDFS through the DataFrame API.
      The issue was found when trying to save a NaiveBayesModel through the MLlib API.
      The log showed an OutOfMemoryError, and after Hue refreshed the log page,
      the log could no longer be found.
      The model-saving function turned out to use "write.parquet" from the DataFrame API.
      Three scripts were then created to test the DataFrame save functions "write.save", "write.json" and "write.parquet". The scripts were built with the Scala IDE installed in the Cloudera 5.5.0 QuickStart VM, using Scala 2.10.6 and the Cloudera Java 7 JDK in the VM.
      The apps run properly when launched from the console with spark-submit, but fail with OutOfMemoryError when run as workflows created in Hue (both in the QuickStart VM and on the real cluster).
      Running the app with larger executor memory only delayed the OutOfMemoryError (more heartbeats appeared in the stdout log before it occurred).
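
      For readers without access to the attachments, the three save calls under test can be sketched roughly as follows. This is a minimal illustration that assumes the Spark 1.5 DataFrame API shipped with CDH 5.5; the object name, the Record case class, the sample data and the output paths are hypothetical and are not taken from the attached scripts, which test each call in a separate file.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical test app; names, data and output paths are illustrative only.
object SaveDataFrameSketch {
  // Defined outside main so that toDF() can derive the schema.
  case class Record(id: Int, label: String)

  def main(args: Array[String]): Unit = {
    // Base output directory on HDFS, passed as the first argument (assumed layout).
    val outputDir = args(0)

    val conf = new SparkConf().setAppName("SaveDataFrameSketch")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Tiny in-memory DataFrame, so the save path is the only thing exercised.
    val df = sc.parallelize(Seq(Record(1, "a"), Record(2, "b"))).toDF()

    // Default data source (Parquet unless spark.sql.sources.default is changed).
    df.write.save(s"$outputDir/default")
    df.write.json(s"$outputDir/json")
    df.write.parquet(s"$outputDir/parquet")

    sc.stop()
  }
}
```

      With a data set this small, an OutOfMemoryError that appears only under the Hue workflow would point at the launch environment rather than at the data volume.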

        Attachments

        1. SaveDataFrameAsJsonTest.scala
          3 kB
        2. SaveDataFrameAsParquetTest.scala
          3 kB
        3. SaveDataFrameTest.scala
          3 kB
        4. stderr.png
          17 kB
        5. stdout.png
          23 kB
        6. syslog.png
          55 kB

            People

            • Assignee:
              Unassigned
            • Reporter:
              hake Kevin Ha
            • Votes:
              0
            • Watchers:
              2