2016-08-26

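Launch an Apache Spark application once and have it wait for data to process

I launch an Apache Spark application on YARN (Hadoop). The application works correctly, but waiting for it to be accepted and to run takes too long. For example, I count the words in a small file (~100 words). I start the app like this: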

/opt/spark/bin/spark-submit --class org.apache.spark.examples.JavaWordCount --deploy-mode cluster --master yarn --driver-memory 2g --executor-memory 2g /opt/spark/examples/jars/spark-examples_2.11-2.0.0.jar hdfs://hadoop-master:9000/input/file.txt 

and then I wait:
- ACCEPTED - 11 s,
- RUNNING - 25 s
plus a few more seconds before ACCEPTED and after RUNNING:

16/08/26 15:18:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/08/26 15:18:27 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/172.29.74.68:8032 
16/08/26 15:18:27 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers 
16/08/26 15:18:27 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (4096 MB per container) 
16/08/26 15:18:27 INFO yarn.Client: Will allocate AM container, with 2432 MB memory including 384 MB overhead 
16/08/26 15:18:27 INFO yarn.Client: Setting up container launch context for our AM 
16/08/26 15:18:27 INFO yarn.Client: Setting up the launch environment for our AM container 
16/08/26 15:18:27 INFO yarn.Client: Preparing resources for our AM container 
16/08/26 15:18:27 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME. 
16/08/26 15:18:32 INFO yarn.Client: Uploading resource file:/tmp/spark-b8aa8874-9747-4c1f-8390-d0abbad019ee/__spark_libs__3386575858123884242.zip -> hdfs://hadoop-master:9000/user/root/.sparkStaging/application_1472201718061_0015/__spark_libs__3386575858123884242.zip 
16/08/26 15:18:37 INFO yarn.Client: Uploading resource file:/opt/spark/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://hadoop-master:9000/user/root/.sparkStaging/application_1472201718061_0015/spark-examples_2.11-2.0.0.jar 
16/08/26 15:18:37 INFO yarn.Client: Uploading resource file:/tmp/spark-b8aa8874-9747-4c1f-8390-d0abbad019ee/__spark_conf__1130150930664135048.zip -> hdfs://hadoop-master:9000/user/root/.sparkStaging/application_1472201718061_0015/__spark_conf__.zip 
16/08/26 15:18:37 INFO spark.SecurityManager: Changing view acls to: root 
16/08/26 15:18:37 INFO spark.SecurityManager: Changing modify acls to: root 
16/08/26 15:18:37 INFO spark.SecurityManager: Changing view acls groups to: 
16/08/26 15:18:37 INFO spark.SecurityManager: Changing modify acls groups to: 
16/08/26 15:18:37 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set() 
16/08/26 15:18:37 INFO yarn.Client: Submitting application application_1472201718061_0015 to ResourceManager 
16/08/26 15:18:37 INFO impl.YarnClientImpl: Submitted application application_1472201718061_0015 
16/08/26 15:18:38 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:38 INFO yarn.Client: 
    client token: N/A 
    diagnostics: N/A 
    ApplicationMaster host: N/A 
    ApplicationMaster RPC port: -1 
    queue: default 
    start time: 1472217517552 
    final status: UNDEFINED 
    tracking URL: http://hadoop-master:8088/proxy/application_1472201718061_0015/ 
    user: root 
16/08/26 15:18:39 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:40 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:41 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:42 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:43 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:44 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:45 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:46 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:47 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:48 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:49 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:50 INFO yarn.Client: Application report for application_1472201718061_0015 (state: ACCEPTED) 
16/08/26 15:18:51 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:18:51 INFO yarn.Client: 
    client token: N/A 
    diagnostics: N/A 
    ApplicationMaster host: 172.29.77.40 
    ApplicationMaster RPC port: 0 
    queue: default 
    start time: 1472217517552 
    final status: UNDEFINED 
    tracking URL: http://hadoop-master:8088/proxy/application_1472201718061_0015/ 
    user: root 
16/08/26 15:18:52 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:18:53 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:18:54 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:18:55 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:18:56 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:18:57 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:18:58 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:18:59 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:00 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:01 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:02 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:03 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:04 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:05 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:06 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:07 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:08 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:09 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:10 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:11 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:12 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:13 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:14 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:15 INFO yarn.Client: Application report for application_1472201718061_0015 (state: RUNNING) 
16/08/26 15:19:16 INFO yarn.Client: Application report for application_1472201718061_0015 (state: FINISHED) 
16/08/26 15:19:16 INFO yarn.Client: 
    client token: N/A 
    diagnostics: N/A 
    ApplicationMaster host: 172.29.77.40 
    ApplicationMaster RPC port: 0 
    queue: default 
    start time: 1472217517552 
    final status: SUCCEEDED 
    tracking URL: http://hadoop-master:8088/proxy/application_1472201718061_0015/ 
    user: root 
16/08/26 15:19:16 INFO util.ShutdownHookManager: Shutdown hook called 
16/08/26 15:19:16 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-b8aa8874-9747-4c1f-8390-d0abbad019ee 

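That is too long for me. I want to launch the application once and have it keep running and wait for data. When I give it a file, it should process the data, return the result, and go back to waiting for the next file. Is this possible with Apache Spark running on YARN?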

Answers


Yes, this is possible. It is called Spark Streaming, and it lets you run batch-style processing continuously.
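As a rough illustration of that idea, here is a minimal PySpark sketch of a word count that is started once and then picks up each new file that appears in a watched directory. It is not the asker's JavaWordCount. The script name `streaming_wordcount.py`, the helper names, and the 5-second batch interval are assumptions made for illustration. It relies on the DStream calls `StreamingContext.textFileStream`, `flatMap`, `map`, `reduceByKey`, and `pprint`.

```python
"""Sketch: a long-running word count that watches a directory for new files."""
import sys
from collections import Counter

BATCH_SECONDS = 5  # assumption: how often Spark looks for new files


def split_words(line):
    """Split one line of text into words on whitespace."""
    return line.split()


def count_words_locally(lines):
    """Plain-Python equivalent of the streaming pipeline, useful for a quick check."""
    counts = Counter()
    for line in lines:
        counts.update(split_words(line))
    return dict(counts)


def main(input_dir):
    # Imported here so the helpers above work without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="StreamingWordCount")
    ssc = StreamingContext(sc, BATCH_SECONDS)

    # Each batch holds the lines of files that newly appeared in input_dir.
    lines = ssc.textFileStream(input_dir)
    counts = (lines.flatMap(split_words)
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.pprint()  # prints the first few results of every batch to the driver log

    ssc.start()             # start once ...
    ssc.awaitTermination()  # ... then keep running and wait for new files


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: streaming_wordcount.py <input-directory>")
```

Submitted once with something like `spark-submit --master yarn --deploy-mode cluster streaming_wordcount.py hdfs://hadoop-master:9000/input/` (the directory is an assumption based on the path in the question), the application pays the ACCEPTED and startup cost a single time. After that, each file moved into the directory is counted in the next batch. Files should be moved or renamed into the directory atomically, because the file stream only sees files that appear there complete. In cluster mode the `pprint` output goes to the driver's container log rather than to the submitting terminal.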
