Spring Boot / Spark application: cannot process files on the Spark node. I have the below setup:
Spark master and slaves are configured and running on my local machine.
17/11/01 18:03:52 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
17/11/01 18:03:52 INFO Master: Starting Spark master at spark://127.0.0.1:7077
17/11/01 18:03:52 INFO Master: Running Spark version 2.2.0
17/11/01 18:03:52 INFO Utils: Successfully started service 'MasterUI' on port 8080.
I have a Spring Boot application whose properties file contents look like the below:
spark.home=/usr/local/Cellar/apache-spark/2.2.0/bin/
master.uri=spark://127.0.0.1:7077
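The two properties above could be wired into a `SparkConf` roughly as follows. This is only a sketch: the class name `SparkConfig`, the app name, and the plain static factory (instead of a Spring `@Bean` method) are assumptions, not taken from the question.

```java
import org.apache.spark.SparkConf;

// Hypothetical helper that turns the two properties from the file above
// into the SparkConf the application autowires.
public class SparkConfig {

    public static SparkConf sparkConf(String sparkHome, String masterUri) {
        return new SparkConf()
                .setAppName("file-processor")   // assumption: any app name works here
                .setSparkHome(sparkHome)        // e.g. /usr/local/Cellar/apache-spark/2.2.0/bin/
                .setMaster(masterUri);          // e.g. spark://127.0.0.1:7077
    }
}
```

In a Spring Boot app this would typically sit in a `@Configuration` class, with the two values injected via `@Value` from the properties file.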
17/11/01 18:16:36 INFO TorrentBroadcast: Started reading broadcast variable 2
17/11/01 18:16:36 INFO TransportClientFactory: Successfully created connection to /192.168.0.135:51903 after 1 ms (0 ms spent in bootstraps)
17/11/01 18:16:36 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 24.4 KB, free 366.3 MB)
17/11/01 18:16:36 INFO TorrentBroadcast: Reading broadcast variable 2 took 82 ms
17/11/01 18:16:36 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 67.2 KB, free 366.2 MB)
17/11/01 18:16:36 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2251)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:80)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
17/11/01 18:16:36 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
@Autowired
SparkConf sparkConf;

public void processFile(String inputFile, String outputFile) {
    JavaSparkContext javaSparkContext;
    SparkContext sc = new SparkContext(sparkConf);
    SerializationWrapper sw = new SerializationWrapper() {
        private static final long serialVersionUID = 1L;

        @Override
        public JavaSparkContext createJavaSparkContext() {
            return JavaSparkContext.fromSparkContext(sc);
        }
    };
    javaSparkContext = sw.createJavaSparkContext();

    JavaRDD<String> lines = javaSparkContext.textFile(inputFile);
    Broadcast<JavaRDD<String>> outputLines;
    outputLines = javaSparkContext.broadcast(lines.map(new Function<String, String>() {
        private static final long serialVersionUID = 1L;

        @Override
        public String call(String arg0) throws Exception {
            return arg0;
        }
    }));
    outputLines.getValue().saveAsTextFile(outputFile);
    // javaSparkContext.close();
}
When I run the code, I get the errors shown in the log above.
The Spring Boot / Spark app should process files based on a REST API call, through which I receive the input and output file locations; these locations are shared across the Spark nodes.
Any suggestions to fix the above errors?
I tried it without the broadcast, but that did not work :-( I got the same error. – user3709612
It's an environment problem. [This](https://issues.apache.org/jira/browse/SPARK-19938) should help. – Arthur
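Following the direction of that JIRA, one way the fix is often sketched is below. This is an assumption-laden illustration, not the asker's code: the `ClassCastException` on `List$SerializationProxy` usually means the executors cannot load the application's classes, so the packaged jar has to be shipped to them (here via `setJars`). The class name, parameters, and jar handling are all hypothetical.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical rewrite of processFile along the lines SPARK-19938 suggests.
public class FileProcessor {

    // master would be e.g. "spark://127.0.0.1:7077" and jars the built
    // application jar in the question's setup (both values assumed here).
    public static void processFile(String master, String[] jars,
                                   String inputFile, String outputFile) {
        SparkConf conf = new SparkConf()
                .setAppName("file-processor")
                .setMaster(master)
                .setJars(jars); // ship the app's classes to every executor
        try (JavaSparkContext jsc = new JavaSparkContext(conf)) {
            JavaRDD<String> lines = jsc.textFile(inputFile);
            // Save the mapped RDD directly; Broadcast is meant for small
            // read-only values, not for RDDs.
            lines.map(s -> s).saveAsTextFile(outputFile);
        }
    }
}
```

Note the two changes relative to the question's code: the mapped RDD is written out directly instead of being wrapped in a `Broadcast`, and the jars are passed explicitly so the executors can deserialize the task's classes.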