Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.7.0
    • Fix Version/s: 3.12.0
    • Component/s: con.oozie
    • Labels:
      None

Description

      The Spark action gives me a ".." button for all fields, which brings up a file browser, but only 3 of the 4 fields actually refer to a file (Jars/py files).

      Also, I assume this supports both Spark Standalone and YARN? If so, the possible values for this field (the Spark master) are:
      • the literal "yarn"
      • the literal "local"
      • the literal "local[*]"
      • "local[n]", where n is an integer number of threads
      • the URL of the Spark Standalone master, which is something like spark://localhost....
      A drop-down list would be good here. Even better, a radio button asking "What cluster type?" where the user can select "Local", "YARN" or "Spark Standalone". If they pick Local, give them the option of entering a thread count or * for all; if they pick YARN, let them select client or cluster mode; if they pick Spark Standalone, let them enter a hostname and port.

      BUT...that said, I don't know how "local" would actually work here. Would it attempt to run locally on the machine the Hue server is running on? The Oozie server? The client machine (my laptop)? I'm not sure the local option is really relevant to Hue.
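      For reference, a minimal sketch of how these master values look on a plain spark-submit command line (the class name, JAR name and host/port below are placeholders, not taken from this issue):

      # YARN: deploy mode can be "client" or "cluster"
      spark-submit --master yarn --deploy-mode client --class com.example.App app.jar
      # Local: a fixed number of threads, or * for one thread per core
      spark-submit --master "local[4]" --class com.example.App app.jar
      spark-submit --master "local[*]" --class com.example.App app.jar
      # Spark Standalone: point at the standalone master URL
      spark-submit --master spark://<master-host>:<port> --class com.example.App app.jar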

      Wish list item: It would be a very nice feature if the user could enter the Jar/py file name, and rather than having to enter the class name, have the UI read the JAR and show them available classes. Also, I don't think "class" is applicable for py files.
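      As a rough sketch of that wish-list idea (not an existing Hue feature), the candidate class names can be read straight out of the JAR's entry list; main.jar is a placeholder name here:

      # List compiled classes inside a JAR and turn entry paths into dotted class names
      jar tf main.jar | grep '\.class$' | sed 's/\//./g; s/\.class$//'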

      And I'm not clear what spark-submit options correspond to the "Jars/py files" field. When doing this from the command line, the main JAR file is provided as an argument to the script, and any OTHER Jar files needed are added with the --jars option. Maybe there should be a "Main Jar/py file" field, and then a "Files+" button to add additional ones, similar to the one on the HiveServer2 action.
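      For comparison, this is roughly how plain spark-submit separates the main application file from extra dependencies; all file and class names below are placeholders:

      # The main JAR is the positional argument; additional JARs go on --jars
      spark-submit --class com.example.App --jars extra-lib1.jar,extra-lib2.jar main-app.jar
      # For PySpark, the main .py file is the positional argument and --py-files adds helpers
      spark-submit --py-files helpers.zip main_script.py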

      --master yarn-client \
      --class com.vw.hy.classname \
      --properties-file /etc/path/spark.conf \
      --files /etc/path/log4j.properties \
      --conf "spark.executor.extraJavaOptions=-Dconfig.resource=application.conf -Dlog4j.configuration=log4j.properties" \
      --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
      --driver-java-options -Dconfig.file=/etc/path/application.conf \
      /opt/vw/path/to/jar/main.jar
      When I create an Oozie workflow, where do I put these options like --conf, --driver-java-options and --files? I couldn't find where to put them.
      

People

    • Assignee: Romain Rigaux (romain)
    • Reporter: Romain Rigaux (romain)
    • Votes: 1
    • Watchers: 4
