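Spark executors fail to connect to mysterious port 35529

A Spark cluster (7 * 2 cores) is set up with Spark 2.0.2 next to an HDFS cluster. When I use Jupyter to read some HDFS files, I see an application start that takes all 14 cores (with spark.executor.instances set to 3), but the executors fail because they cannot connect to a strange "localhost" port, 35529. This is the code: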
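from pyspark.sql import SparkSession

# master and appName are defined elsewhere in the notebook (not shown here)
spark = SparkSession.builder.master(master).appName(appName).config("spark.executor.instances", 3).getOrCreate()
sc = spark.sparkContext
# HDFS address (masked) and a glob over the 23h directory
hdfs_master = "hdfs://xx.xx.xx.xx:8020"
hdfs_path = "/logs/cycliste_debug/2017/2017_02/2017_02_20/23h/*"
infos = sc.textFile(hdfs_master + hdfs_path)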
Here is the summary of the cluster (I find it strange to see 14 cores allocated when I would expect at most 3 * 2 CPUs, i.e. spark.executor.instances * number of cores per node):
Executor Summary for app-20170227140938-0009:
ExecutorID  Worker                                    Cores  Memory (MB)  State    Logs
1488        worker-20170227125912-xx.xx.xx.xx-38028   2      1024         RUNNING  stdout stderr
1489        worker-20170227125954-xx.xx.xx.xx-48962   2      1024         RUNNING  stdout stderr
5           worker-20170227125959-xx.xx.xx.xx-48149   2      1024         RUNNING  stdout stderr
1486        worker-20170227130012-xx.xx.xx.xx-47639   2      1024         RUNNING  stdout stderr
1490        worker-20170227130027-xx.xx.xx.xx-44921   2      1024         RUNNING  stdout stderr
1485        worker-20170227130152-xx.xx.xx.xx-50620   2      1024         RUNNING  stdout stderr
1487        worker-20170227130248-xx.xx.xx.xx-42100   2      1024         RUNNING  stdout stderr
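
The mismatch described above can be checked with a little arithmetic. The following is only a sketch: the rows are copied from the executor summary, the memory column is assumed to be in MB, and the variable names are made up for illustration.

```python
# Executor rows copied from the summary above:
# (executor ID, cores, memory, assumed to be in MB).
executors = [
    (1488, 2, 1024),
    (1489, 2, 1024),
    (5, 2, 1024),
    (1486, 2, 1024),
    (1490, 2, 1024),
    (1485, 2, 1024),
    (1487, 2, 1024),
]

requested_instances = 3  # spark.executor.instances from the session config
cores_per_worker = 2     # each of the 7 workers offers 2 cores

# What the cluster actually handed out versus what the config seemed to ask for.
allocated_cores = sum(cores for _, cores, _ in executors)
expected_cores = requested_instances * cores_per_worker

print(f"executors running: {len(executors)}")
print(f"cores allocated:   {allocated_cores}")
print(f"cores expected:    {expected_cores}")
```

So 7 executors hold 14 cores where 6 were expected. The worker IDs suggest a standalone cluster, and as far as the Spark configuration documentation goes, spark.cores.max is the setting that caps the total cores of an application in standalone mode, while spark.executor.instances is documented for YARN. That may explain the allocation, but it is an assumption to verify, not a confirmed diagnosis.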
And an example of the error for one worker, from the stderr log page for app-20170227140938-0009/1488:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/02/27 14:37:57 INFO CoarseGrainedExecutorBackend: Started daemon with process name: [email protected]
17/02/27 14:37:57 INFO SignalUtils: Registered signal handler for TERM
17/02/27 14:37:57 INFO SignalUtils: Registered signal handler for HUP
17/02/27 14:37:57 INFO SignalUtils: Registered signal handler for INT
17/02/27 14:37:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/02/27 14:37:58 INFO SecurityManager: Changing view acls to: spark
17/02/27 14:37:58 INFO SecurityManager: Changing modify acls to: spark
17/02/27 14:37:58 INFO SecurityManager: Changing view acls groups to:
17/02/27 14:37:58 INFO SecurityManager: Changing modify acls groups to:
17/02/27 14:37:58 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark); groups with view permissions: Set(); users with modify permissions: Set(spark); groups with modify permissions: Set()
17/02/27 14:38:01 WARN ThreadLocalRandom: Failed to generate a seed from SecureRandom within 3 seconds. Not enough entrophy?
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:70)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:174)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:270)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:88)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:71)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:70)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.io.IOException: Failed to connect to localhost/127.0.0.1:35529
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: localhost/127.0.0.1:35529
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
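
The trace bottoms out in a refused connection, so it helps to pin down exactly which endpoint was dialed. Here is a minimal sketch that pulls host, IP and port out of the innermost "Caused by" line and checks whether the address is loopback; the helper name and the regular expression are illustrative, not part of Spark.

```python
import ipaddress
import re

# The "Failed to connect" line from the executor's stderr above.
log_line = "Caused by: java.io.IOException: Failed to connect to localhost/127.0.0.1:35529"

# The endpoint is printed as "<hostname>/<ipv4>:<port>".
ENDPOINT_RE = re.compile(
    r"Failed to connect to (?P<host>[^/\s]*)/(?P<ip>\d+(?:\.\d+){3}):(?P<port>\d+)"
)


def parse_failed_endpoint(line):
    """Return (host, ip, port) from a 'Failed to connect to' line, or None."""
    match = ENDPOINT_RE.search(line)
    if match is None:
        return None
    return match.group("host"), match.group("ip"), int(match.group("port"))


host, ip, port = parse_failed_endpoint(log_line)
is_loopback = ipaddress.ip_address(ip).is_loopback

print(host, ip, port, is_loopback)  # prints: localhost 127.0.0.1 35529 True
```

The executor runs on a worker machine, so 127.0.0.1 there is the worker itself. The call to RpcEnv.setupEndpointRefByURI in the trace suggests the executor was trying to reach an address it had been given, which would make 35529 the driver's RPC port and "localhost" the name the driver advertised. That reading is an assumption; the log alone does not prove it.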
This looks like a simple communication problem between two processes.
So here is my /etc/hosts:
127.0.0.1 localhost
193.xx.xx.xxx vpsxxxx.ovh.net vpsxxxx
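
One thing that could be tried, assuming the driver really is advertising "localhost", is to set the driver's address explicitly. The sketch below uses the documented spark.driver.host and spark.driver.port settings; the helper functions, the host name "jupyter-host.example" and the port 40000 are placeholders invented for illustration, not values from this cluster.

```python
def driver_network_conf(driver_host, driver_port=None):
    """Build Spark settings that pin the address the driver advertises."""
    if driver_host in ("localhost", "127.0.0.1"):
        # A loopback name would reproduce the failure seen in the executor log.
        raise ValueError("driver_host must be reachable from the workers")
    conf = {"spark.driver.host": driver_host}
    if driver_port is not None:
        conf["spark.driver.port"] = str(driver_port)
    return conf


def build_session(master, app_name, conf):
    """Create a SparkSession with the extra settings applied."""
    # Imported here so driver_network_conf can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master(master).appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Placeholder host and port: replace with the notebook machine's routable
# name or IP and a port that the workers are allowed to reach.
conf = driver_network_conf("jupyter-host.example", driver_port=40000)
print(conf)
```

Calling build_session(master, appName, conf) would then replace the builder chain in the notebook. Fixing the port is optional; it mainly helps when a firewall sits between the workers and the notebook host.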
Any ideas?