
Cannot submit a Spark job using a Python file

I am trying to run a Spark job from a Python (.py) file using the following command:

$SPARK_HOME/bin/spark-submit ~/Project/SparkTest.py --py-files ~/Project/SparkTest.py

The job fails with the exception "Could not parse Master URL".

I did some debugging and found that when the job starts, spark.master is set to '<pyspark.conf.SparkConf object at 0x7fb6b70e3898>' instead of 'spark://10.0.0.5:31016', which is the master IP and port configured in spark-defaults.conf.
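For reference, the master URL entry in spark-defaults.conf would look roughly like the line below. The actual file is not shown in the question, so this is only a sketch based on the values quoted above:

    # $SPARK_HOME/conf/spark-defaults.conf
    spark.master    spark://10.0.0.5:31016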

Edit: I found the solution right after posting (see my answer below). Here is the complete output after submitting the Spark job:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
17/11/19 22:25:43 INFO SparkContext: Running Spark version 2.2.0 
17/11/19 22:25:43 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
17/11/19 22:25:44 INFO SparkContext: Submitted application: SparkTest.py 
17/11/19 22:25:44 INFO SparkContext: Spark configuration: 
spark.app.name=SparkTest.py 
spark.driver.cores=2 
spark.driver.memory=3g 
spark.eventLog.dir=hdfs://10.0.0.5:31001/spark_log 
spark.eventLog.enabled=true 
spark.executor.memory=3g 
spark.files=file:/home/admin/Project/SparkTest.py 
spark.kryoserializer.buffer.max=1536m 
spark.logConf=true 
spark.master=<pyspark.conf.SparkConf object at 0x7fb6b70e3898> 
spark.rdd.compress=True 
spark.serializer=org.apache.spark.serializer.KryoSerializer 
spark.serializer.objectStreamReset=100 
spark.submit.deployMode=client 
17/11/19 22:25:44 INFO SecurityManager: Changing view acls to: admin 
17/11/19 22:25:44 INFO SecurityManager: Changing modify acls to: admin 
17/11/19 22:25:44 INFO SecurityManager: Changing view acls groups to: 
17/11/19 22:25:44 INFO SecurityManager: Changing modify acls groups to: 
17/11/19 22:25:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set() 
17/11/19 22:25:44 INFO Utils: Successfully started service 'sparkDriver' on port 41829. 
17/11/19 22:25:44 INFO SparkEnv: Registering MapOutputTracker 
17/11/19 22:25:44 INFO SparkEnv: Registering BlockManagerMaster 
17/11/19 22:25:44 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 
17/11/19 22:25:44 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 
17/11/19 22:25:44 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-4007fc95-6531-4447-a095-0730713d7758 
17/11/19 22:25:44 INFO MemoryStore: MemoryStore started with capacity 1458.6 MB 
17/11/19 22:25:44 INFO SparkEnv: Registering OutputCommitCoordinator 
17/11/19 22:25:44 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
17/11/19 22:25:44 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.0.5:4040 
17/11/19 22:25:44 INFO SparkContext: Added file file:/home/admin/Project/SparkTest.py at spark://10.0.0.5:41829/files/SparkTest.py with timestamp 1511130344827 
17/11/19 22:25:44 INFO Utils: Copying /home/admin/Project/SparkTest.py to /tmp/spark-940a6faa-cf59-4d47-87c6-b3f39296c19d/userFiles-d3c17550-6141-496d-aacd-0f83f813a3a0/SparkTest.py 
17/11/19 22:25:44 ERROR SparkContext: Error initializing SparkContext. 
org.apache.spark.SparkException: Could not parse Master URL: '<pyspark.conf.SparkConf object at 0x7fb6b70e3898>' 
     at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2760) 
     at org.apache.spark.SparkContext.<init>(SparkContext.scala:501) 
     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
     at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247) 
     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 
     at py4j.Gateway.invoke(Gateway.java:236) 
     at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80) 
     at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69) 
     at py4j.GatewayConnection.run(GatewayConnection.java:214) 
     at java.lang.Thread.run(Thread.java:748) 
17/11/19 22:25:44 INFO SparkUI: Stopped Spark web UI at http://10.0.0.5:4040 
17/11/19 22:25:44 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
17/11/19 22:25:44 INFO MemoryStore: MemoryStore cleared 
17/11/19 22:25:44 INFO BlockManager: BlockManager stopped 
17/11/19 22:25:44 INFO BlockManagerMaster: BlockManagerMaster stopped 
17/11/19 22:25:44 WARN MetricsSystem: Stopping a MetricsSystem that is not running 
17/11/19 22:25:44 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 
17/11/19 22:25:44 INFO SparkContext: Successfully stopped SparkContext 
Traceback (most recent call last): 
    File "/home/admin/Project/SparkTest.py", line 21, in <module> 
    sc = SparkContext(conf) 
    File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 118, in __init__ 
    File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 180, in _do_init 
    File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 273, in _initialize_context 
    File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1401, in __call__ 
    File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value 
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. 
: org.apache.spark.SparkException: Could not parse Master URL: '<pyspark.conf.SparkConf object at 0x7fb6b70e3898>' 
     at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2760) 
     at org.apache.spark.SparkContext.<init>(SparkContext.scala:501) 
     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
     at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247) 
     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 
     at py4j.Gateway.invoke(Gateway.java:236) 
     at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80) 
     at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69) 
     at py4j.GatewayConnection.run(GatewayConnection.java:214) 
     at java.lang.Thread.run(Thread.java:748) 

17/11/19 22:25:44 INFO ShutdownHookManager: Shutdown hook called 
17/11/19 22:25:44 INFO ShutdownHookManager: Deleting directory /tmp/spark-940a6faa-cf59-4d47-87c6-b3f39296c19d 

Answer


When instantiating SparkContext, I was passing the conf object directly as a positional argument. Changing it to SparkContext(conf=conf), using the parameter name, solved the problem.
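For illustration, here is a minimal sketch of the change. The actual contents of SparkTest.py are not shown in the question, so the surrounding lines are assumptions; the point is the positional-versus-keyword argument described above:

    from pyspark import SparkConf, SparkContext

    # Picks up spark-defaults.conf settings such as spark.master
    conf = SparkConf()

    # Before: conf is passed positionally, so it is bound to SparkContext's
    # first parameter, 'master', and spark.master ends up as the repr of the
    # SparkConf object:
    # sc = SparkContext(conf)

    # After: pass it explicitly as the 'conf' keyword argument, so the master
    # URL is taken from spark-defaults.conf as intended:
    sc = SparkContext(conf=conf)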


This helped me! – igonejack
