2016-11-26

Error while submitting a Python sample application in Apache Spark standalone mode

I am trying to launch an Apache Spark cluster using Amazon EC2 m4.large instances for all nodes. My setup is currently as follows:

  • A driver EC2 instance, on which I installed Spark 2.0.2 and do my development
  • A Spark master
  • Two Spark slaves

I created the cluster using the spark-ec2 script, which I downloaded from https://github.com/amplab/spark-ec2/tree/branch-2.0

In my browser I can see my Spark master's web UI, and it shows the two slave nodes attached to it.

So now, when I try to submit the example application examples/src/main/python/pi.py from my $SPARK_HOME using the following command, as suggested here:

./bin/spark-submit --master spark://ip-172-31-43-158.ec2.internal:7077 examples/src/main/python/pi.py 1000 

I get the following error:

[[email protected] spark]$ ./bin/spark-submit --master spark://ip-172-31-43-158.ec2.internal:7077 examples/src/main/python/pi.py 1000 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
16/11/26 01:37:59 INFO SparkContext: Running Spark version 2.0.2 
16/11/26 01:37:59 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/11/26 01:37:59 INFO SecurityManager: Changing view acls to: ec2-user 
16/11/26 01:37:59 INFO SecurityManager: Changing modify acls to: ec2-user 
16/11/26 01:37:59 INFO SecurityManager: Changing view acls groups to: 
16/11/26 01:37:59 INFO SecurityManager: Changing modify acls groups to: 
16/11/26 01:37:59 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ec2-user); groups with view permissions: Set(); users with modify permissions: Set(ec2-user); groups with modify permissions: Set() 
16/11/26 01:38:00 INFO Utils: Successfully started service 'sparkDriver' on port 37830. 
16/11/26 01:38:00 INFO SparkEnv: Registering MapOutputTracker 
16/11/26 01:38:00 INFO SparkEnv: Registering BlockManagerMaster 
16/11/26 01:38:00 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-f4b273d5-1d43-4e19-859a-3fb88fe527ff 
16/11/26 01:38:00 INFO MemoryStore: MemoryStore started with capacity 366.3 MB 
16/11/26 01:38:00 INFO SparkEnv: Registering OutputCommitCoordinator 
16/11/26 01:38:00 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
16/11/26 01:38:00 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.31.48.106:4040 
16/11/26 01:38:00 INFO SparkContext: Added file file:/opt/spark-2.0.2-bin-hadoop2.7/examples/src/main/python/pi.py at spark://172.31.48.106:37830/files/pi.py with timestamp 1480124280871 
16/11/26 01:38:00 INFO Utils: Copying /opt/spark-2.0.2-bin-hadoop2.7/examples/src/main/python/pi.py to /tmp/spark-e16c0211-227d-45bb-866c-b38c84bd3344/userFiles-5e86cb15-a858-4421-9005-0d04364d2c68/pi.py 
16/11/26 01:38:00 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://ip-172-31-43-158.ec2.internal:7077... 
16/11/26 01:38:20 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://ip-172-31-43-158.ec2.internal:7077... 
16/11/26 01:38:40 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://ip-172-31-43-158.ec2.internal:7077... 
16/11/26 01:39:00 ERROR StandaloneSchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up. 

16/11/26 01:39:00 WARN StandaloneSchedulerBackend: Application ID is not initialized yet. 
16/11/26 01:39:01 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44219. 
16/11/26 01:39:01 INFO NettyBlockTransferService: Server created on 172.31.48.106:44219 
16/11/26 01:39:01 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.31.48.106, 44219) 
16/11/26 01:39:01 INFO BlockManagerMasterEndpoint: Registering block manager 172.31.48.106:44219 with 366.3 MB RAM, BlockManagerId(driver, 172.31.48.106, 44219) 
16/11/26 01:39:01 INFO SparkUI: Stopped Spark web UI at http://172.31.48.106:4040 
16/11/26 01:39:01 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.31.48.106, 44219) 
16/11/26 01:39:01 INFO StandaloneSchedulerBackend: Shutting down all executors 
16/11/26 01:39:01 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down 
16/11/26 01:39:01 WARN StandaloneAppClient$ClientEndpoint: Drop UnregisterApplication(null) because has not yet connected to master 
16/11/26 01:39:01 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
16/11/26 01:39:01 ERROR MapOutputTrackerMaster: Error communicating with MapOutputTracker 
java.lang.InterruptedException 
     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1038) 
     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326) 
     at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208) 
     at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218) 
     at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) 
     at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190) 
     at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) 
     at scala.concurrent.Await$.result(package.scala:190) 
     at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81) 
     at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102) 
     at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78) 
     at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:100) 
     at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:110) 
     at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:580) 
     at org.apache.spark.SparkEnv.stop(SparkEnv.scala:84) 
     at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1797) 
     at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1290) 
     at org.apache.spark.SparkContext.stop(SparkContext.scala:1796) 
     at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.dead(StandaloneSchedulerBackend.scala:142) 
     at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint.markDead(StandaloneAppClient.scala:254) 
     at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anon$2.run(StandaloneAppClient.scala:131) 
     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
     at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) 
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
     at java.lang.Thread.run(Thread.java:745) 
16/11/26 01:39:01 ERROR Utils: Uncaught exception in thread appclient-registration-retry-thread 
org.apache.spark.SparkException: Error communicating with MapOutputTracker 
     at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:104) 
     at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:110) 
     at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:580) 
     at org.apache.spark.SparkEnv.stop(SparkEnv.scala:84) 
     at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1797) 
     at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1290) 
     at org.apache.spark.SparkContext.stop(SparkContext.scala:1796) 
     at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.dead(StandaloneSchedulerBackend.scala:142) 
     at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint.markDead(StandaloneAppClient.scala:254) 
     at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anon$2.run(StandaloneAppClient.scala:131) 
     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
     at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) 
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
     at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.InterruptedException 
     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1038) 
     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326) 
     at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208) 
     at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218) 
     at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) 
     at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190) 
     at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) 
     at scala.concurrent.Await$.result(package.scala:190) 
     at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81) 
     at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102) 
     at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78) 
     at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:100) 
     ... 16 more 
16/11/26 01:39:01 INFO SparkContext: Successfully stopped SparkContext 
16/11/26 01:39:01 ERROR SparkContext: Error initializing SparkContext. 
java.lang.NullPointerException 
     at org.apache.spark.SparkContext.<init>(SparkContext.scala:546) 
     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) 
     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
     at java.lang.reflect.Constructor.newInstance(Constructor.java:526) 
     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240) 
     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 
     at py4j.Gateway.invoke(Gateway.java:236) 
     at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80) 
     at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69) 
     at py4j.GatewayConnection.run(GatewayConnection.java:214) 
     at java.lang.Thread.run(Thread.java:745) 
16/11/26 01:39:01 INFO SparkContext: SparkContext already stopped. 
Traceback (most recent call last): 
    File "/opt/spark-2.0.2-bin-hadoop2.7/examples/src/main/python/pi.py", line 32, in <module> 
    .appName("PythonPi")\ 
    File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 169, in getOrCreate 
    File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 294, in getOrCreate 
    File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 115, in __init__ 
    File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 168, in _do_init 
    File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 233, in _initialize_context 
    File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1401, in __call__ 
    File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py", line 319, in get_return_value 
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. 
: java.lang.NullPointerException 
     at org.apache.spark.SparkContext.<init>(SparkContext.scala:546) 
     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) 
     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
     at java.lang.reflect.Constructor.newInstance(Constructor.java:526) 
     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240) 
     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 
     at py4j.Gateway.invoke(Gateway.java:236) 
     at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80) 
     at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69) 
     at py4j.GatewayConnection.run(GatewayConnection.java:214) 
     at java.lang.Thread.run(Thread.java:745) 

16/11/26 01:39:01 INFO DiskBlockManager: Shutdown hook called 
16/11/26 01:39:01 INFO ShutdownHookManager: Shutdown hook called 
16/11/26 01:39:01 INFO ShutdownHookManager: Deleting directory /tmp/spark-e16c0211-227d-45bb-866c-b38c84bd3344/userFiles-5e86cb15-a858-4421-9005-0d04364d2c68 
16/11/26 01:39:01 INFO ShutdownHookManager: Deleting directory /tmp/spark-e16c0211-227d-45bb-866c-b38c84bd3344 
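The repeated "Connecting to master …" lines followed by "All masters are unresponsive! Giving up." show that the driver never managed to reach the master on port 7077. A quick way to verify connectivity from the machine you are submitting from is a minimal probe like the one below (`check_master` is a hypothetical helper, not part of Spark, and it assumes bash with `/dev/tcp` support):

```shell
# Probe whether a Spark standalone master accepts TCP connections.
# check_master is a hypothetical helper; standalone masters listen on 7077 by default.
check_master() {
  local host="$1" port="${2:-7077}"
  if timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}

# Prints "reachable" or "unreachable" depending on whether the port answers.
check_master ip-172-31-43-158.ec2.internal 7077
```

If this prints "unreachable", the problem is networking (security groups, wrong host, or submitting from a machine outside the cluster) rather than Spark itself.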

This is a screenshot of my Spark master's web UI.

Any pointers on what I'm doing wrong? I'd really appreciate some help with this!

Answer


Well, I was being silly. I was trying to run spark-submit from the driver itself. I was supposed to SSH into the master (as suggested here) and run the command from there.

As mentioned in the question:

  • Go to the ec2 directory in the Spark release you downloaded
  • Run ./spark-ec2 -k <keypair> -i <key-file> login <cluster-name> to SSH into the cluster
  • Then run the spark-submit command
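Put together, the sequence looks like this (a sketch assuming the directory layout described above; <keypair>, <key-file>, and <cluster-name> are placeholders for your own values):

```shell
# From the ec2/ directory of the downloaded Spark release, log in to the master:
cd spark-2.0.2-bin-hadoop2.7/ec2
./spark-ec2 -k <keypair> -i <key-file> login <cluster-name>

# Now on the master node, submit the example from the Spark install there:
./bin/spark-submit --master spark://ip-172-31-43-158.ec2.internal:7077 \
    examples/src/main/python/pi.py 1000
```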