
Cannot set up Apache Spark 2.1.1 on Windows 10

I installed Apache Spark 2.1.1 on Windows 10, with Java 1.8 and Python 3.6 from Anaconda 4.3.1. I downloaded winutils.exe, set the JAVA_HOME, HADOOP_HOME, and SPARK_HOME environment variables, and updated the Path variable accordingly. I also ran winutils.exe chmod -R 777 \tmp\hive. However, when I run pyspark from the cmd prompt, I get the error shown below.
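Roughly, the setup steps above amount to the following (just a sketch; the install locations are placeholders, not my actual paths):

import os
import subprocess

# Placeholder install locations -- adjust to where Java, Hadoop/winutils and Spark actually live.
os.environ["JAVA_HOME"] = r"C:\Program Files\Java\jdk1.8.0_131"
os.environ["HADOOP_HOME"] = r"C:\hadoop"   # bin\winutils.exe sits under this folder
os.environ["SPARK_HOME"] = r"C:\Spark"

# Put the Hadoop and Spark bin folders on the PATH for this process.
os.environ["PATH"] += ";" + os.path.join(os.environ["HADOOP_HOME"], "bin")
os.environ["PATH"] += ";" + os.path.join(os.environ["SPARK_HOME"], "bin")

# Same effect as running "winutils.exe chmod -R 777 \tmp\hive" from cmd.
subprocess.run([os.path.join(os.environ["HADOOP_HOME"], "bin", "winutils.exe"),
                "chmod", "-R", "777", r"\tmp\hive"], check=True)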

Can someone please help? If I have missed any important details, please let me know. Thanks in advance!

c:\Spark>bin\pyspark 
Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 11:57:41) [MSC v.1900 64 bit (AMD64)] on win32 
Type "help", "copyright", "credits" or "license" for more information. 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
Setting default log level to "WARN". 
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). 
Traceback (most recent call last): 
    File "c:\Spark\python\pyspark\sql\utils.py", line 63, in deco 
    return f(*a, **kw) 
    File "c:\Spark\python\lib\py4j-0.10.4-src.zip\py4j\protocol.py", line 319, in get_return_value 
py4j.protocol.Py4JJavaError: An error occurred while calling o22.sessionState. 
: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState': 
     at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981) 
     at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110) 
     at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109) 
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
     at java.lang.reflect.Method.invoke(Method.java:498) 
     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) 
     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 

When I launch spark-shell, I still get an error, but it looks like Spark does start, since I do get to the "Welcome to Spark" part. The error I get is the following:

C:\Spark>bin\spark-shell 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
Setting default log level to "WARN". 
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). 
17/06/23 12:20:15 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/jars/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../jars/datanucleus-api-jdo-3.2.6.jar." 
17/06/23 12:20:15 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/jars/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../jars/datanucleus-rdbms-3.2.9.jar." 
17/06/23 12:20:15 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/bin/../jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/jars/datanucleus-core-3.2.10.jar." 
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState': 
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981) 
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110) 
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109) 
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878) 
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878) 
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99) 
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99) 
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230) 
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40) 
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99) 
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878) 
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:96) 
... 47 elided 
Caused by: java.lang.reflect.InvocationTargetException: 
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog': 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978) 
... 58 more 
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog': 
at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169) 
at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86) 
at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101) 
at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101) 
at scala.Option.getOrElse(Option.scala:121) 
at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101) 
at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100) 
at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157) 
at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32) 
... 63 more 
Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0(Ljava/lang/String;I)V 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166) 
... 71 more 
Caused by: java.lang.reflect.InvocationTargetException: 
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0(Ljava/lang/String;I)V 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264) 
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358) 
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262) 
at org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:66) 
... 76 more 
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0(Ljava/lang/String;I)V 
at org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0(Native Method) 
at org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode(NativeIO.java:524) 
at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:478) 
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:532) 
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:509) 
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:305) 
at org.apache.hadoop.hive.ql.session.SessionState.createPath(SessionState.java:639) 
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:561) 
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508) 
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188) 
... 84 more 
14: error: not found: value spark 
    import spark.implicits._ 
     ^
14: error: not found: value spark 
    import spark.sql 
     ^
Welcome to 

Thanks @Alfrabravo! – AmyJ


Do you get the error when you launch 'spark-shell', or is it specific to 'pyspark'? –


@SamsonScharfrichter I have updated my question, but yes, spark-shell seems to start, whereas pyspark does not. – AmyJ

Answer


Here is the setup that worked for me:

Install pyspark and findspark from the "Anaconda command prompt":

pip3 install pyspark 
pip3 install findspark 

You have already downloaded the Spark distribution. Unzip it and save it on the C drive (C:\spark-2.2.0-bin-hadoop2.7), then create a new environment variable "SPARK_HOME" and set it to "C:\spark-2.2.0-bin-hadoop2.7\bin"; also open the "Path" system variable and add the same value there. Now open your command prompt, cd up twice so that you go from "C:\Users\*" to "C:\", and run the following command:

set SPARK_HOME='spark-2.2.0-bin-hadoop2.7' 

and you are good to go. Now, before importing pyspark in a Jupyter notebook, you only need to tell findspark where the Spark installation lives. Use the following code:

import findspark 
findspark.init(r'C:\spark-2.2.0-bin-hadoop2.7')  # raw string so the backslashes are not treated as escape sequences 
import pyspark 
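
As a quick check (just a sketch, not part of the original steps), you can then build a local SparkSession and a tiny DataFrame to confirm the setup works:

from pyspark.sql import SparkSession

# Start a local session; if the Hive/winutils setup is still broken, this is where it fails.
spark = SparkSession.builder.master("local[*]").appName("setup-check").getOrCreate()

# A tiny DataFrame just to confirm the session is usable.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.show()

spark.stop()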