2017-07-05 1 views
0

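Spark worker nodes start but do not appear in the web UI

I am trying to set up a small 3-node Spark cluster using Raspberry Pis and my main desktop, but I cannot get the Pis to connect to the master node (the desktop). I am running Cassandra (DSE, not the open-source version) on all three nodes, so the network is set up correctly. When I open the web UI, it shows only my main computer. I can enter each worker node's web UI address and get its own web UI page, but the workers do not seem to know about my master node. Each slave node is listed in my slaves file. I feel like I am missing just one small thing to get this working. Any suggestions are welcome. Below are some logs and other information that I hope will keep this fairly short and concise.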

My spark-env.sh on all nodes (identical except that the local IP is adjusted for each node):

export SPARK_WORKER_CORES=6 
export SPARK_MASTER_HOST=192.168.0.106 

export SPARK_LOCAL_IP=192.168.0.201 

The log from a worker node is as follows:

Spark Command: /usr/lib/jvm/jdk-8-oracle-arm32-vfp-hflt/jre/bin/java -cp /home/spark/spark/conf/:/home/spark/spark/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://Palehorse:7077 
======================================== 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
17/07/05 03:22:40 INFO Worker: Started daemon with process name: [email protected] 
17/07/05 03:22:40 INFO SignalUtils: Registered signal handler for TERM 
17/07/05 03:22:40 INFO SignalUtils: Registered signal handler for HUP 
17/07/05 03:22:40 INFO SignalUtils: Registered signal handler for INT 
17/07/05 03:22:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
17/07/05 03:22:42 INFO SecurityManager: Changing view acls to: spark 
17/07/05 03:22:42 INFO SecurityManager: Changing modify acls to: spark 
17/07/05 03:22:42 INFO SecurityManager: Changing view acls groups to: 
17/07/05 03:22:42 INFO SecurityManager: Changing modify acls groups to: 
17/07/05 03:22:42 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark); groups with view permissions: Set(); users with modify permissions: Set(spark); groups with modify permissions: Set() 
17/07/05 03:22:43 INFO Utils: Successfully started service 'sparkWorker' on port 35342. 
17/07/05 03:22:44 INFO Worker: Starting Spark worker 192.168.0.201:35342 with 6 cores, 1024.0 MB RAM 
17/07/05 03:22:44 INFO Worker: Running Spark version 2.1.1 
17/07/05 03:22:44 INFO Worker: Spark home: /home/spark/spark 
17/07/05 03:22:45 INFO Utils: Successfully started service 'WorkerUI' on port 8081. 
17/07/05 03:22:45 INFO WorkerWebUI: Bound WorkerWebUI to 192.168.0.201, and started at http://192.168.0.201:8081 
17/07/05 03:22:45 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:22:51 INFO Worker: Retrying connection to master (attempt # 1) 
17/07/05 03:22:51 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:22:57 INFO Worker: Retrying connection to master (attempt # 2) 
17/07/05 03:22:57 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:23:03 INFO Worker: Retrying connection to master (attempt # 3) 
17/07/05 03:23:03 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:23:09 INFO Worker: Retrying connection to master (attempt # 4) 
17/07/05 03:23:09 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:23:15 INFO Worker: Retrying connection to master (attempt # 5) 
17/07/05 03:23:15 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:23:21 INFO Worker: Retrying connection to master (attempt # 6) 
17/07/05 03:23:21 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:23:57 INFO Worker: Retrying connection to master (attempt # 7) 
17/07/05 03:23:57 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:24:33 INFO Worker: Retrying connection to master (attempt # 8) 
17/07/05 03:24:33 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:24:45 ERROR RpcOutboxMessage: Ask timeout before connecting successfully 
17/07/05 03:24:45 WARN NettyRpcEnv: Ignored failure: java.io.IOException: Connecting to Palehorse/198.105.254.63:7077 timed out (120000 ms) 
17/07/05 03:24:45 WARN Worker: Failed to connect to master Palehorse:7077 
org.apache.spark.SparkException: Exception thrown in awaitResult 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77) 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75) 
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59) 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59) 
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167) 
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83) 
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100) 
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108) 
    at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1$$anon$1.run(Worker.scala:218) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.io.IOException: Connecting to Palehorse/198.105.254.63:7077 timed out (120000 ms) 
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:229) 
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182) 
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197) 
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194) 
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190) 
    ... 4 more 
17/07/05 03:25:09 INFO Worker: Retrying connection to master (attempt # 9) 
17/07/05 03:25:09 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:25:45 INFO Worker: Retrying connection to master (attempt # 10) 
17/07/05 03:25:45 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:26:21 INFO Worker: Retrying connection to master (attempt # 11) 
17/07/05 03:26:21 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:26:57 INFO Worker: Retrying connection to master (attempt # 12) 
17/07/05 03:26:57 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:27:09 ERROR RpcOutboxMessage: Ask timeout before connecting successfully 
17/07/05 03:27:09 WARN NettyRpcEnv: Ignored failure: java.io.IOException: Connecting to Palehorse/198.105.254.63:7077 timed out (120000 ms) 
17/07/05 03:27:09 WARN Worker: Failed to connect to master Palehorse:7077 
org.apache.spark.SparkException: Exception thrown in awaitResult 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77) 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75) 
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59) 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59) 
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167) 
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83) 
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100) 
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108) 
    at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1$$anon$1.run(Worker.scala:218) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.io.IOException: Connecting to Palehorse/198.105.254.63:7077 timed out (120000 ms) 
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:229) 
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182) 
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197) 
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194) 
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190) 
    ... 4 more 
17/07/05 03:27:33 INFO Worker: Retrying connection to master (attempt # 13) 
17/07/05 03:27:33 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:28:09 INFO Worker: Retrying connection to master (attempt # 14) 
17/07/05 03:28:09 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:28:45 INFO Worker: Retrying connection to master (attempt # 15) 
17/07/05 03:28:45 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:29:21 INFO Worker: Retrying connection to master (attempt # 16) 
17/07/05 03:29:21 INFO Worker: Connecting to master Palehorse:7077... 
17/07/05 03:29:33 ERROR RpcOutboxMessage: Ask timeout before connecting successfully 
17/07/05 03:29:33 WARN NettyRpcEnv: Ignored failure: java.io.IOException: Connecting to Palehorse/198.105.254.63:7077 timed out (120000 ms) 
17/07/05 03:29:33 WARN Worker: Failed to connect to master Palehorse:7077 
org.apache.spark.SparkException: Exception thrown in awaitResult 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77) 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75) 
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59) 
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59) 
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167) 
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83) 
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100) 
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108) 
    at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1$$anon$1.run(Worker.scala:218) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.io.IOException: Connecting to Palehorse/198.105.254.63:7077 timed out (120000 ms) 
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:229) 
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182) 
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197) 
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194) 
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190) 
    ... 4 more 
17/07/05 03:29:57 ERROR Worker: All masters are unresponsive! Giving up. 
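
In a log like this, the useful detail is the address the worker actually tried, because the name Palehorse is resolved to an IP before the connection is attempted. The following is a minimal sketch for pulling that out, assuming the worker log has been saved to a file. The function name `resolved_masters` and the path in the usage note are hypothetical, not part of Spark.

```shell
# Hypothetical helper: list each distinct "host/ip:port" target that a
# Spark worker log reports as timed out.
# $1 is the path to a saved worker log (an assumption; use your own path).
resolved_masters() {
    # Match only the "Connecting to <target> timed out" messages, not the
    # "Connecting to master <name>..." progress lines.
    grep -o 'Connecting to [^ ]* timed out' "$1" | awk '{ print $3 }' | sort -u
}
```

Called as `resolved_masters /path/to/worker.log` on the log above, this should print the single target `Palehorse/198.105.254.63:7077`, which can then be compared with the `SPARK_MASTER_HOST` value in spark-env.sh.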
+0

Can you ping '192.168.0.106' from the worker machine? Can the master machine ping the worker machines? From your log: _"Connecting to Palehorse/198.105.254.63:7077 timed out"_. What is that IP?
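
One way to make the point about that IP concrete is to check whether each address falls in a private (RFC 1918) range, which is where a home LAN address such as 192.168.0.x would be. The sketch below is only an illustration; the helper name `is_private_ipv4` is hypothetical, and it does simple pattern matching rather than full address validation.

```shell
# Hypothetical helper: succeed (exit 0) if $1 looks like an IPv4 address in
# one of the RFC 1918 private ranges, fail (exit 1) otherwise.
# Only the dotted prefix is pattern-matched; the input is not validated.
is_private_ipv4() {
    case "$1" in
        10.*) return 0 ;;                                        # 10.0.0.0/8
        192.168.*) return 0 ;;                                   # 192.168.0.0/16
        172.1[6-9].* | 172.2[0-9].* | 172.3[01].*) return 0 ;;   # 172.16.0.0/12
        *) return 1 ;;
    esac
}

# Addresses taken from the spark-env.sh and the worker log in the question.
for ip in 192.168.0.106 192.168.0.201 198.105.254.63; do
    if is_private_ipv4 "$ip"; then
        echo "$ip private"
    else
        echo "$ip not private"
    fi
done
```

The two 192.168.0.x addresses are reported as private and 198.105.254.63 is not, so the name Palehorse was resolved to something outside the local network rather than to the master at 192.168.0.106.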

+0

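Well, all the machines can talk to each other... I can ssh from each of them to the others. I did have one thought, though. When I run start-all.sh, I am asked to enter a password for each slave node. Perhaps the same thing is happening in the other direction, but it fails because there is no way to prompt me for the password. Is this normal, or do I need to change some user settings?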

+0

Can you start a standalone worker manually and see whether it throws any exceptions? I would avoid 'start-all.sh' until that is fixed.

Answer

0

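I was finally able to get the slaves to talk to the master. The master's name in my /etc/hosts file was set to the 127.0.1.1 address. The other problem appeared to be with start-all.sh; I start each slave with start-slave.sh spark://<master ip address>:7077 instead.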
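
The /etc/hosts problem described here can be spotted by listing the names a hosts file maps to a loopback address. The sketch below is one possible way to do that; the function name `loopback_names` is hypothetical, and it assumes the usual hosts format of an address followed by one or more names.

```shell
# Hypothetical helper: print "address name" for every hostname that a
# hosts-format file maps to a 127.x.x.x address, ignoring comments,
# blank lines and the name "localhost".
# $1 is the file to inspect, normally /etc/hosts.
loopback_names() {
    awk '
        { sub(/#.*/, "") }            # strip comments; fields are re-split
        $1 ~ /^127\./ {
            for (i = 2; i <= NF; i++)
                if ($i != "localhost") print $1, $i
        }
    ' "$1"
}
```

Running `loopback_names /etc/hosts` on a machine with the entry described in the answer would list the master's name next to 127.0.1.1. Once the name maps to the LAN address instead, a worker can be started by hand from Spark's sbin directory with `start-slave.sh spark://192.168.0.106:7077`, using the master address from the question's spark-env.sh.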
