
java.lang.ClassNotFoundException when submitting a Scala application to a standalone Spark cluster

I am trying to use a Spark cluster with an application that depends only on Scala 2.11 and Java 8.

My cluster consists of two worker nodes and one master, each with all the dependencies (jars) installed, the same user account (spark) and the same OS (Ubuntu 16.04.2 LTS).

The code I am trying to run from IntelliJ IDEA:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._

object Main extends App {

  val sparkConf = new SparkConf()
    .setAppName("Application")
    .setMaster("spark://<IP-Address-Master>:7077")
  val sparkContext = new SparkContext(sparkConf)
  sparkContext.setLogLevel("ERROR")

  val NB_VERTICES = 50 // vertices count (TO ADAPT)
  val DENSITY = 50     // graph density (TO ADAPT)

  // graph generation based on vertices count and density (generateGraph is defined elsewhere)
  var graph = generateGraph(NB_VERTICES, sparkContext, DENSITY)
  var previousGraph = graph // previous iteration's graph
  var hasChanged = true     // boolean to loop over

  while (hasChanged) {
    previousGraph = graph                              // save previous graph
    graph = execute(graph, 1)                          // execute 1 iteration of our algorithm
    hasChanged = hasGraphChanged(previousGraph, graph) // if nothing changed, we break out of the loop
  }
}

Note: I did not include functions such as generateGraph because the post would get too long. What is important to know is that this code works perfectly when run locally instead of on the cluster, and that it depends only on Spark GraphX, Scala and Java.

So when the cluster is up (each worker registers and is visible in the web UI) and I try to run this application, I get the following error:

17/06/08 16:05:00 INFO SparkContext: Running Spark version 2.1.0 
17/06/08 16:05:01 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
17/06/08 16:05:01 WARN Utils: Your hostname, workstation resolves to a loopback address: 127.0.1.1; using 172.16.24.203 instead (on interface enx28f10e4fec2a) 
17/06/08 16:05:01 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address 
17/06/08 16:05:01 INFO SecurityManager: Changing view acls to: spark 
17/06/08 16:05:01 INFO SecurityManager: Changing modify acls to: spark 
17/06/08 16:05:01 INFO SecurityManager: Changing view acls groups to: 
17/06/08 16:05:01 INFO SecurityManager: Changing modify acls groups to: 
17/06/08 16:05:01 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark); groups with view permissions: Set(); users with modify permissions: Set(spark); groups with modify permissions: Set() 
17/06/08 16:05:02 INFO Utils: Successfully started service 'sparkDriver' on port 42652. 
17/06/08 16:05:02 INFO SparkEnv: Registering MapOutputTracker 
17/06/08 16:05:02 INFO SparkEnv: Registering BlockManagerMaster 
17/06/08 16:05:02 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 
17/06/08 16:05:02 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 
17/06/08 16:05:02 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-fe269631-606f-4e03-a75a-82809f4dce2d 
17/06/08 16:05:02 INFO MemoryStore: MemoryStore started with capacity 869.7 MB 
17/06/08 16:05:02 INFO SparkEnv: Registering OutputCommitCoordinator 
17/06/08 16:05:02 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
17/06/08 16:05:02 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.16.24.203:4040 
17/06/08 16:05:02 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://172.16.24.203:7077... 
17/06/08 16:05:03 INFO TransportClientFactory: Successfully created connection to /172.16.24.203:7077 after 50 ms (0 ms spent in bootstraps) 
17/06/08 16:05:03 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20170608160503-0000 
17/06/08 16:05:03 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42106. 
17/06/08 16:05:03 INFO NettyBlockTransferService: Server created on 172.16.24.203:42106 
17/06/08 16:05:03 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 
17/06/08 16:05:03 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.16.24.203, 42106, None) 
17/06/08 16:05:03 INFO BlockManagerMasterEndpoint: Registering block manager 172.16.24.203:42106 with 869.7 MB RAM, BlockManagerId(driver, 172.16.24.203, 42106, None) 
17/06/08 16:05:03 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.16.24.203, 42106, None) 
17/06/08 16:05:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20170608160503-0000/0 on worker-20170608145510-172.16.24.196-41159 (172.16.24.196:41159) with 8 cores 
17/06/08 16:05:03 INFO StandaloneSchedulerBackend: Granted executor ID app-20170608160503-0000/0 on hostPort 172.16.24.196:41159 with 8 cores, 1024.0 MB RAM 
17/06/08 16:05:03 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 172.16.24.203, 42106, None) 
17/06/08 16:05:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20170608160503-0000/1 on worker-20170608185509-172.16.24.210-42227 (172.16.24.210:42227) with 4 cores 
17/06/08 16:05:03 INFO StandaloneSchedulerBackend: Granted executor ID app-20170608160503-0000/1 on hostPort 172.16.24.210:42227 with 4 cores, 1024.0 MB RAM 
17/06/08 16:05:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20170608160503-0000/0 is now RUNNING 
17/06/08 16:05:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20170608160503-0000/1 is now RUNNING 
17/06/08 16:05:03 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0 
17/06/08 16:05:10 ERROR TaskSetManager: Task 1 in stage 6.0 failed 4 times; aborting job 
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 6.0 failed 4 times, most recent failure: Lost task 1.3 in stage 6.0 (TID 14, 172.16.24.196, executor 0): java.lang.ClassNotFoundException: Main$$anonfun$3 
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357) 
    at java.lang.Class.forName0(Native Method) 
    at java.lang.Class.forName(Class.java:348) 
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) 
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1819) 
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1986) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422) 
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75) 
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114) 
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:85) 
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) 
    at org.apache.spark.scheduler.Task.run(Task.scala:99) 
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 

Driver stacktrace: 
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422) 
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) 
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802) 
    at scala.Option.foreach(Option.scala:257) 
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1650) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594) 
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) 
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628) 
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1918) 
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1931) 
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1944) 
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958) 
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:935) 
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) 
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) 
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:362) 
    at org.apache.spark.rdd.RDD.collect(RDD.scala:934) 
    at Main$.hasGraphChanged(Main.scala:168) 
    at Main$.main(Main.scala:401) 
    at Main.main(Main.scala) 
Caused by: java.lang.ClassNotFoundException: Main$$anonfun$3 
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357) 
    at java.lang.Class.forName0(Native Method) 
    at java.lang.Class.forName(Class.java:348) 
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) 
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1819) 
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1986) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) 
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422) 
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75) 
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114) 
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:85) 
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) 
    at org.apache.spark.scheduler.Task.run(Task.scala:99) 
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 

Both executors show up as registered, yet the job fails anyway.

I tried running the simple Pi approximation in spark-shell; it worked fine and was distributed across the cluster. I have tried many of the approaches suggested here (setting a JARS environment variable on each node, using SparkConf to add the jars manually, etc.), but I still get this error and I do not know what else to try.
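For reference, the SparkConf attempt looked roughly like this; a minimal sketch only, using SparkConf.setJars, where the jar path is just an example and not my actual path:

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: the path below is an example and has to point at the jar that
// actually contains the compiled application classes (Main and its anonymous functions).
val sparkConf = new SparkConf()
  .setAppName("Application")
  .setMaster("spark://<IP-Address-Master>:7077")
  .setJars(Seq("/home/spark/jobs/application_2.11-0.1.jar"))

val sparkContext = new SparkContext(sparkConf)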

Does anyone have any idea what could be causing this?

Thank you in advance.


What does generateGraph do?


generateGraph generates a graph (a GraphX Graph) based on the number of vertices and the density. Each vertex holds a VertexId and a colour (represented as an Int), and each edge only has a String attribute made from the concatenation of its srcId and dstId. I create my vertices and edges with sparkContext.makeRDD(array), where array is an Array of Tuple2 of VertexId and Int for the vertices, and an Array of Edge for the edges.
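For illustration only, that description corresponds roughly to the sketch below; it is a guess, not the original code (the colour initialisation and the way the density is applied are assumptions):

import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Guess at a possible generateGraph, based only on the description in the comment above.
def generateGraph(nbVertices: Int, sparkContext: SparkContext, density: Int): Graph[Int, String] = {
  val rnd = new scala.util.Random(0)

  // Each vertex gets a VertexId and a colour (an Int); here every vertex starts with its own id as colour.
  val vertexArray: Array[(VertexId, Int)] =
    (0L until nbVertices.toLong).map(id => (id, id.toInt)).toArray

  // Each edge only carries a String attribute built from the concatenation of srcId and dstId;
  // an edge is kept with probability density/100 (assumption).
  val edgeArray: Array[Edge[String]] = (for {
    src <- 0L until nbVertices.toLong
    dst <- 0L until nbVertices.toLong
    if src != dst && rnd.nextInt(100) < density
  } yield Edge(src, dst, s"$src$dst")).toArray

  val vertices: RDD[(VertexId, Int)] = sparkContext.makeRDD(vertexArray)
  val edges: RDD[Edge[String]] = sparkContext.makeRDD(edgeArray)

  Graph(vertices, edges)
}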

Answer


Is that the entire code as shown? If so, that is exactly the problem.

Wrap your code in an object SparkApp with a main method and try again:

object SparkApp {
  def main(args: Array[String]): Unit = {
    // ...your code here
  }
}

You can also use object SparkApp extends App, but it is known to sometimes lead to failures.

I also strongly recommend using the latest and greatest Spark 2.1.1.


I just edited the code, but it works locally (.setMaster("local[*]")) as shown. I am not going to upgrade.


That is not the full code; there is too much of it to post.
