Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2
    • Fix Version/s: 0.2
    • Component/s: Tests
    • Labels: None

Description

Currently the mini cluster cannot run pyspark or sparkr tests. It uses a fake Spark installation assembled from maven dependencies, and those dependencies do not include the packages needed to run pyspark or sparkr (since those packages are not published to maven).

We need to figure out a way to run these kinds of tests, though. A couple of options:

• Require users to set a "REAL_SPARK_HOME" env variable, and only run those tests when the variable is set.
• Download a Spark tarball to set up a proper Spark installation.

The first approach will not help with CI tests on GitHub, while the second would download a large file every time you cleaned up your build directory... so I'm leaning towards the first one, since we can set up an internal Jenkins job with a proper Spark install.
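
A minimal sketch of what the first option could look like, assuming ScalaTest: the suite cancels (rather than fails) the pyspark tests when REAL_SPARK_HOME is not set, so runs without a real Spark install stay green. Only the "REAL_SPARK_HOME" variable name comes from this issue; the suite name, test body, and helper value are illustrative.

    import org.scalatest.FunSuite

    // Hypothetical sketch of option 1: gate the pyspark/sparkr tests on a
    // REAL_SPARK_HOME env variable pointing at a real Spark installation.
    class PySparkIntegrationSuite extends FunSuite {

      // "REAL_SPARK_HOME" is the variable name proposed in this issue;
      // everything else in this suite is illustrative.
      private val realSparkHome: Option[String] = sys.env.get("REAL_SPARK_HOME")

      test("pyspark code runs against a real Spark installation") {
        // assume() cancels the test (instead of failing it) when the env
        // variable is missing, so CI without a real Spark still passes.
        assume(realSparkHome.isDefined, "REAL_SPARK_HOME not set; skipping pyspark test")

        val sparkHome = realSparkHome.get
        // ... start the mini cluster against sparkHome and submit a trivial
        // pyspark statement here ...
      }
    }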

People

• Assignee: Marcelo Vanzin (vanzin)
• Reporter: Marcelo Vanzin (vanzin)
• Votes: 0
• Watchers: 1
