  1. CDH (READ-ONLY)
  2. DISTRO-832

MLlib in PySpark fails to execute due to missing numpy dependency

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark
    • Labels:
      None
    • Environment:
      Spark 2.0 beta, Spark 1.6, CentOS

      Description

      MLlib functions in PySpark fail with an ImportError when numpy is not installed. The workaround is described in https://community.hortonworks.com/articles/49710/spark-mllib-function-fails-with-error-importerror.html

      However, this workaround is neither intuitive nor convenient, because numpy has to be installed on every cluster node.

      Note that "pip install numpy" does not work. The only working solution is "yum install numpy".
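      Because numpy has to be importable on every node, it can help to check which nodes are still missing it. The following is a minimal sketch, not part of the original report: probe_module and probe_cluster are hypothetical helper names, sc is assumed to be an existing SparkContext, and num_tasks is an arbitrary guess. Spark does not guarantee that tasks land on every node, so a clean result is only indicative. The probe also reports the interpreter path, which may help explain why a pip install is not picked up (for example, if pip targets a different Python than the one Spark uses; this is an assumption, not something confirmed in this report).

```python
import importlib
import socket
import sys


def probe_module(module_name="numpy"):
    """Report whether module_name is importable by the current interpreter.

    Returns (hostname, interpreter_path, available, detail), where detail is
    the module version on success or the ImportError message on failure.
    """
    host = socket.gethostname()
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return (host, sys.executable, False, str(exc))
    return (host, sys.executable, True, getattr(module, "__version__", "unknown"))


def probe_cluster(sc, module_name="numpy", num_tasks=100):
    """Run probe_module inside Spark tasks and collect the distinct results.

    sc is an existing SparkContext (assumed). Tasks are not guaranteed to be
    scheduled on every node, so raise num_tasks if some hosts are missing.
    """
    return (
        sc.parallelize(range(num_tasks), num_tasks)
        .map(lambda _: probe_module(module_name))
        .distinct()
        .collect()
    )
```

      From a pyspark shell, calling probe_cluster(sc) would return one tuple per distinct host and interpreter. Any entry whose third field is False points at a node where numpy still needs to be installed.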

      Credit: bug was found by Chris Lin.


            People

            • Assignee:
              Unassigned
            • Reporter:
              Wei-Chiu Chuang (weichiu)
            • Votes:
              0
            • Watchers:
              2
