
I was trying out SimpleApp.java against a Spark Standalone cluster with one worker. But all I get, after every change I make, is the error below: Exception in thread "main" org.apache.spark.SparkException: Job aborted: Spark cluster looks down

Exception in thread "main" org.apache.spark.SparkException: Job aborted: Spark cluster looks down 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018) 
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) 
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604) 
    at scala.Option.foreach(Option.scala:236) 
    at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190) 
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) 
    at akka.actor.ActorCell.invoke(ActorCell.scala:456) 
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) 
    at akka.dispatch.Mailbox.run(Mailbox.scala:219) 
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) 
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 

My setup is as follows:

  • Everything runs on localhost

  • There is one standalone master and one worker (screenshot of the Master web UI omitted)

The following lines are from the master log:

    Spark Command: /usr/lib/jvm/java-7-oracle/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g -XX:MaxPermSize=256m org.apache.spark.deploy.master.Master --host 192.168.97.128 --port 7077 --webui-port 8080 
    ======================================== 
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
    16/11/18 12:36:57 INFO Master: Started daemon with process name: [email protected] 
    16/11/18 12:36:57 INFO SignalUtils: Registered signal handler for TERM 
    16/11/18 12:36:57 INFO SignalUtils: Registered signal handler for HUP 
    16/11/18 12:36:57 INFO SignalUtils: Registered signal handler for INT 
    16/11/18 12:36:57 WARN MasterArguments: SPARK_MASTER_IP is deprecated, please use SPARK_MASTER_HOST 
    16/11/18 12:36:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
    16/11/18 12:36:58 INFO SecurityManager: Changing view acls to: vinay 
    16/11/18 12:36:58 INFO SecurityManager: Changing modify acls to: vinay 
    16/11/18 12:36:58 INFO SecurityManager: Changing view acls groups to: 
    16/11/18 12:36:58 INFO SecurityManager: Changing modify acls groups to: 
    16/11/18 12:36:58 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(vinay); groups with view permissions: Set(); users with modify permissions: Set(vinay); groups with modify permissions: Set() 
    16/11/18 12:36:59 INFO Utils: Successfully started service 'sparkMaster' on port 7077. 
    16/11/18 12:36:59 INFO Master: Starting Spark master at spark://192.168.97.128:7077 
    16/11/18 12:36:59 INFO Master: Running Spark version 2.0.1 
    16/11/18 12:36:59 INFO Utils: Successfully started service 'MasterUI' on port 8080. 
    16/11/18 12:36:59 INFO MasterWebUI: Bound MasterWebUI to 192.168.97.128, and started at http://192.168.97.128:8080 
    16/11/18 12:36:59 INFO Utils: Successfully started service on port 6066. 
    16/11/18 12:36:59 INFO StandaloneRestServer: Started REST server for submitting applications on port 6066 
    16/11/18 12:36:59 INFO Master: I have been elected leader! New state: ALIVE 
    16/11/18 12:38:58 INFO Master: 192.168.97.128:34770 got disassociated, removing it. 
    

    In spark-env.sh I set the following environment variables, along with the corrected configuration entries:

    SPARK_MASTER_HOST=192.168.97.128 
    SPARK_MASTER_IP=192.168.97.128 
    SPARK_LOCAL_IP=192.168.97.128 
    SPARK_PUBLIC_DNS=192.168.97.128 
    SPARK_WORKER_CORES=2 
    SPARK_WORKER_MEMORY=2g 

    SimpleApp.java:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;

    public class SimpleApp {
        public static void main(String[] args) {
            System.out.println("hello world!!");
            String logFile = "/usr/local/spark/README.md"; // Should be some file on your system
            SparkConf conf = new SparkConf().setAppName("Simple Application");
            conf.setMaster("spark://192.168.97.128:7077");
            // conf.set(key, value)
            // conf.setMaster("local[4]");
            JavaSparkContext sc = new JavaSparkContext(conf);
            JavaRDD<String> logData = sc.textFile(logFile).cache();

            // Count the lines containing "a" and "b" respectively
            long numAs = logData.filter(new Function<String, Boolean>() {
                public Boolean call(String s) { return s.contains("a"); }
            }).count();

            long numBs = logData.filter(new Function<String, Boolean>() {
                public Boolean call(String s) { return s.contains("b"); }
            }).count();

            System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);

            sc.stop();
        }
    }
    
    and likewise as environment variables:

    SPARK_LOCAL_IP=192.168.97.128 
    SPARK_MASTER_IP=192.168.97.128 
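
    Since the standalone scheduler can only place executors whose resource request fits inside what the worker advertises, the application can also cap its own request on the SparkConf. A minimal sketch (the class name ConfSketch and the property values are illustrative, not from the original post):

    import org.apache.spark.SparkConf;

    // Sketch: request less memory/cores than the worker advertises
    // (SPARK_WORKER_MEMORY / SPARK_WORKER_CORES above), so the
    // standalone scheduler can actually place executors on it.
    public class ConfSketch {
        public static SparkConf build() {
            return new SparkConf()
                    .setAppName("Simple Application")
                    .setMaster("spark://192.168.97.128:7077")
                    .set("spark.executor.memory", "512m") // must fit within SPARK_WORKER_MEMORY
                    .set("spark.cores.max", "1");         // must fit within SPARK_WORKER_CORES
        }
    }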
    

    Update 1: output of free -m

    [email protected]:/usr/local/spark/sbin$ free -m 
                  total        used        free      shared  buff/cache   available 
    Mem:           7875        4500         970         531        2404        2756 
    Swap:          8082           6        8076 
    

    Update 2: program output

    16/11/18 15:33:05 INFO slf4j.Slf4jLogger: Slf4jLogger started 
    16/11/18 15:33:05 INFO Remoting: Starting remoting 
    16/11/18 15:33:05 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:43526] 
    16/11/18 15:33:05 INFO Remoting: Remoting now listens on addresses: [akka.tcp://[email protected]:43526] 
    16/11/18 15:33:05 INFO spark.SparkEnv: Registering BlockManagerMaster 
    16/11/18 15:33:06 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20161118153305-9cf5 
    16/11/18 15:33:06 INFO storage.MemoryStore: MemoryStore started with capacity 1050.6 MB. 
    16/11/18 15:33:06 INFO network.ConnectionManager: Bound socket to port 46557 with id = ConnectionManagerId(192.168.97.128,46557) 
    16/11/18 15:33:06 INFO storage.BlockManagerMaster: Trying to register BlockManager 
    16/11/18 15:33:06 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Registering block manager 192.168.97.128:46557 with 1050.6 MB RAM 
    16/11/18 15:33:06 INFO storage.BlockManagerMaster: Registered BlockManager 
    16/11/18 15:33:06 INFO spark.HttpServer: Starting HTTP Server 
    16/11/18 15:33:06 INFO server.Server: jetty-7.6.8.v20121106 
    16/11/18 15:33:06 INFO server.AbstractConnector: Started [email protected]:33688 
    16/11/18 15:33:06 INFO broadcast.HttpBroadcast: Broadcast server started at http://192.168.97.128:33688 
    16/11/18 15:33:06 INFO spark.SparkEnv: Registering MapOutputTracker 
    16/11/18 15:33:06 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-633ba798-963f-4b02-ab23-1edb4e677fde 
    16/11/18 15:33:06 INFO spark.HttpServer: Starting HTTP Server 
    16/11/18 15:33:06 INFO server.Server: jetty-7.6.8.v20121106 
    16/11/18 15:33:06 INFO server.AbstractConnector: Started [email protected]:46433 
    16/11/18 15:33:06 INFO server.Server: jetty-7.6.8.v20121106 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/storage/rdd,null} 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/storage,null} 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages/stage,null} 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages/pool,null} 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages,null} 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/environment,null} 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/executors,null} 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/metrics/json,null} 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/static,null} 
    16/11/18 15:33:06 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/,null} 
    16/11/18 15:33:06 INFO server.AbstractConnector: Started [email protected]:4040 
    16/11/18 15:33:06 INFO ui.SparkUI: Started Spark Web UI at http://192.168.97.128:4040 
    16/11/18 15:33:06 INFO client.AppClient$ClientActor: Connecting to master spark://192.168.97.128:7077... 
    16/11/18 15:33:07 INFO storage.MemoryStore: ensureFreeSpace(32856) called with curMem=0, maxMem=1101633945 
    16/11/18 15:33:07 INFO storage.MemoryStore: Block broadcast_0 stored as values to memory (estimated size 32.1 KB, free 1050.6 MB) 
    16/11/18 15:33:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
    16/11/18 15:33:08 WARN snappy.LoadSnappy: Snappy native library not loaded 
    16/11/18 15:33:08 INFO mapred.FileInputFormat: Total input paths to process : 1 
    16/11/18 15:33:08 INFO spark.SparkContext: Starting job: count at SimpleApp.java:20 
    16/11/18 15:33:08 INFO scheduler.DAGScheduler: Got job 0 (count at SimpleApp.java:20) with 2 output partitions (allowLocal=false) 
    16/11/18 15:33:08 INFO scheduler.DAGScheduler: Final stage: Stage 0 (count at SimpleApp.java:20) 
    16/11/18 15:33:08 INFO scheduler.DAGScheduler: Parents of final stage: List() 
    16/11/18 15:33:08 INFO scheduler.DAGScheduler: Missing parents: List() 
    16/11/18 15:33:08 INFO scheduler.DAGScheduler: Submitting Stage 0 (FilteredRDD[2] at filter at SimpleApp.java:18), which has no missing parents 
    16/11/18 15:33:08 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (FilteredRDD[2] at filter at SimpleApp.java:18) 
    16/11/18 15:33:08 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 2 tasks 
    16/11/18 15:33:23 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory 
    16/11/18 15:33:26 INFO client.AppClient$ClientActor: Connecting to master spark://192.168.97.128:7077... 
    16/11/18 15:33:38 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory 
    16/11/18 15:33:46 INFO client.AppClient$ClientActor: Connecting to master spark://192.168.97.128:7077... 
    16/11/18 15:33:53 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory 
    16/11/18 15:34:06 ERROR client.AppClient$ClientActor: All masters are unresponsive! Giving up. 
    16/11/18 15:34:06 ERROR cluster.SparkDeploySchedulerBackend: Spark cluster looks dead, giving up. 
    16/11/18 15:34:06 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
    16/11/18 15:34:06 INFO scheduler.DAGScheduler: Failed to run count at SimpleApp.java:20 
    Exception in thread "main" org.apache.spark.SparkException: Job aborted: Spark cluster looks down 
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020) 
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018) 
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) 
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018) 
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604) 
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604) 
        at scala.Option.foreach(Option.scala:236) 
        at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604) 
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190) 
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) 
        at akka.actor.ActorCell.invoke(ActorCell.scala:456) 
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) 
        at akka.dispatch.Mailbox.run(Mailbox.scala:219) 
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) 
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 
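
    A quick way to isolate the problem, hinted at by the commented-out line in SimpleApp.java above, is to run the same job with a local master so the standalone cluster is taken out of the picture. A minimal sketch (the class name LocalModeCheck is illustrative, not from the original post):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    // Sketch: run the job in local mode to separate application problems
    // from cluster-connectivity problems.
    public class LocalModeCheck {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("Simple Application")
                    .setMaster("local[4]"); // 4 local threads, no standalone master involved
            JavaSparkContext sc = new JavaSparkContext(conf);
            long lines = sc.textFile("/usr/local/spark/README.md").count();
            System.out.println("Lines: " + lines);
            sc.stop();
        }
    }

    If this runs, the job itself is fine and the failure lies in reaching or registering with the master at spark://192.168.97.128:7077.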
    
  • +0

    Hope it helps. Can you try again and print the output of 'free -m'? –

    +0

    Added the program output. – vinay

    Answers

    2

    Your free memory is 970 MB, but you have configured the worker with 2 GB. Try setting SPARK_WORKER_MEMORY to 500 MB and see if this helps.
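
    If that fixes it, the corresponding spark-env.sh entry would look something like the following (value illustrative, mirroring the variables shown above; the worker has to be restarted to pick it up):

    SPARK_WORKER_MEMORY=500m 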

    +0

    Still not working; I tried 100 MB as well. – vinay

    +0

    See this line: WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory. It seems like a memory issue. –

    +0

    Can you post your updated spark-env.sh? –
