私はScalaとSparkを初めて使用しています。私は単純なコードを作って、それは私のローカルマシンで正常に実行されました。そこで、私は.jar
のファイルをmavenで作成し、それを私のクラスタマシンにコピーして分散システム上でテストしました。しかし、私は自分のコマンドを始めました。コンソールは以下のようにエラーを投げます。複数のスカラバージョンを使用していますか?
*******CLASSPATH = ********
java.lang.ClassNotFoundException: App
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:693)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
私はすでにそれをGoogleで検索し、クラス名がpackage name + class name
でなければならないことが分かりました。しかし、私の場合はうまくいかなかった。そして私は、複数のスカラバージョンを使用している別の原因を発見しました。そこで私はpom.xml
を調べて、スカラとスパークのバージョンを確認しました。私は、バージョンクラスタのバージョンに応じてバージョン名を変更しましたが、結果も同じでした。
以下は、私のpom.xmlです。
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.sclee.scala0</groupId>
<artifactId>scala_tutorial</artifactId>
<version>1.0-SNAPSHOT</version>
<name>${project.artifactId}</name>
<description>My wonderfull scala app</description>
<inceptionYear>2015</inceptionYear>
<licenses>
<license>
<name>My License</name>
<url>http://....</url>
<distribution>repo</distribution>
</license>
</licenses>
<properties>
<maven.compiler.source>1.6</maven.compiler.source>
<maven.compiler.target>1.6</maven.compiler.target>
<encoding>UTF-8</encoding>
<scala.version>2.10.4</scala.version>
<scala.compat.version>2.10</scala.compat.version>
</properties>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>2.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>2.0.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.10.4</version>
</dependency>
</dependencies>
<pluginRepositories>
<pluginRepository>
<id>scala-tools.org</id>
<name>Scala-tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</pluginRepository>
</pluginRepositories>
<build>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<executions>
<execution>
<id>compile</id>
<goals>
<goal>compile</goal>
</goals>
<phase>compile</phase>
</execution>
<execution>
<id>test-compile</id>
<goals>
<goal>testCompile</goal>
</goals>
<phase>test-compile</phase>
</execution>
<execution>
<phase>process-resources</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.5</source>
<target>1.5</target>
</configuration>
</plugin>
</plugins>
</build>
</project>
そして私は、きれいにコンパイルとMavenパッケージを使用してパッケージ化しようとしたときに、あるいくつかの警告があります。 (私はこの警告が私の結果に関連するかもしれないとは確信していません。しかし、この問題を解決するために、私は以下のようなログメッセージを添付しました)。
/usr/local/java/jdk1.7.0_80/bin/java -Dmaven.home=/usr/local/maven/apache-maven-3.1.1 -Dclassworlds.conf=/usr/local/maven/apache-maven-3.1.1/bin/m2.conf -Didea.launcher.port=7536 -Didea.launcher.bin.path=/usr/local/intellij/idea-IC-163.10154.41/bin -Dfile.encoding=UTF-8 -classpath /usr/local/maven/apache-maven-3.1.1/boot/plexus-classworlds-2.5.1.jar:/usr/local/intellij/idea-IC-163.10154.41/lib/idea_rt.jar com.intellij.rt.execution.application.AppMain org.codehaus.classworlds.Launcher -Didea.version=2016.3.2 clean compile package
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for com.sclee.scala0:scala_tutorial:jar:1.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 92, column 15
[WARNING] 'build.plugins.plugin.version' for org.scala-tools:maven-scala-plugin is missing. @ line 65, column 15
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building scala_tutorial 1.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
Downloading: http://scala-tools.org/repo-releases/org/scala-tools/maven-scala-plugin/maven-metadata.xml
[WARNING] Could not transfer metadata org.scala-tools:maven-scala-plugin/maven-metadata.xml from/to scala-tools.org (http://scala-tools.org/repo-releases): peer not authenticated
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ scala_tutorial ---
[INFO] Deleting /home/dst/Documents/01_Intellij_workspace/scala_tutorial/target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ scala_tutorial ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dst/Documents/01_Intellij_workspace/scala_tutorial/src/main/resources
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:compile (default) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] /home/dst/Documents/01_Intellij_workspace/scala_tutorial/src/main/scala:-1: info: compiling
[INFO] Compiling 1 source files to /home/dst/Documents/01_Intellij_workspace/scala_tutorial/target/classes at 1486089232696
[WARNING] warning: there were 2 deprecation warning(s); re-run with -deprecation for details
[WARNING] one warning found
[INFO] prepare-compile in 0 s
[INFO] compile in 6 s
[INFO]
[INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ scala_tutorial ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:compile (compile) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ scala_tutorial ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dst/Documents/01_Intellij_workspace/scala_tutorial/src/main/resources
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:compile (default) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ scala_tutorial ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:compile (compile) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ scala_tutorial ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dst/Documents/01_Intellij_workspace/scala_tutorial/src/test/resources
[INFO]
[INFO] --- maven-compiler-plugin:2.5.1:testCompile (default-testCompile) @ scala_tutorial ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:testCompile (test-compile) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[WARNING] No source files found.
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ scala_tutorial ---
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ scala_tutorial ---
[INFO] Building jar: /home/dst/Documents/01_Intellij_workspace/scala_tutorial/target/scala_tutorial-1.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 15.990s
[INFO] Finished at: Thu Feb 02 21:34:00 EST 2017
[INFO] Final Memory: 27M/291M
[INFO] ------------------------------------------------------------------------
Process finished with exit code 0
Scalaバージョンクラスタの用途は2.10.4
です。そこで、pom.xmlのすべてのバージョンを変更しようとしました。私のスカラコードは以下の通りです。 DataFrameを使用してデータを変換するのは簡単な作業です。
私はscalaで書かれたjarファイルを使い、クラスタモードでテストしたいと思います。そして私の質問には、間違ったプロセス(バージョンの問題など)がありますか?どんな助けもありがとう。
package com.sclee.scala0
import org.apache.spark._
import org.apache.spark.sql.SQLContext
object App {
def main(args: Array[String]): Unit = {
val conf = new SparkConf().setAppName("wordCount").setMaster("local[*]")
// Create a Scala Spark Context.
val sc = new SparkContext(conf)
val sqlContext= new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
val df = sc.textFile(args(0)).map(_.split('\t')).map(x => (x(0),x(1),x(2),x(3),x(4),x(5),x(6))).toDF("c1","c2","c3","c4","c5","c6","c7")
val res = df.explode("c7","c8")((x:String) => x.split('|')).drop("c7")
res.write.format("com.databricks.spark.csv").option("delimiter","\t").save(args(1))
}
}
最後に、これはscala jarを実行するコマンドです。
spark-submit \
--class App \
--master spark://spark.dso.xxxx \
--executor-memory 10G \
/home/jumbo/user/sclee/dt/jars/scala_tutorial-1.0-SNAPSHOT.jar \
/user/sclee/data_big/ /user/sclee/output_scala_csv
提案した変数を変更したときに一致するバージョンがありません。 scala.versionは2.11.xで、spark-mllibとspark-sql libには2.11.xと互換性のあるバージョンがないためです。 – sclee1
使用しているScalaのバージョンは?あなたのpom.xmlには、2.10.4を挙げました。 –
Scala 2.11.Xでは、spark-mlibおよびspark-sqlライブラリが利用できます。このリンクをたどってmavenリポジトリhttps://mvnrepository.com/artifact/org.apache.sparkを入手することもできますが、クラスタのScalaバージョンは2.10です。これを 'pom.xml'でも使用する必要があります –