Apache Spark MLlib with the DataFrame API throws java.net.URISyntaxException on createDataFrame() or read().csv(...)
In a standalone application (Java 8, Windows 10, spark-xxx_2.11 jars as dependencies), the following code fails with an error:
/* this: */
Dataset<Row> logData = spark_session.createDataFrame(Arrays.asList(
new LabeledPoint(1.0, Vectors.dense(4.9,3,1.4,0.2)),
new LabeledPoint(1.0, Vectors.dense(4.7,3.2,1.3,0.2))
), LabeledPoint.class);
/* or this: */
/* logFile: "C:\files\project\file.csv", "C:\\files\\project\\file.csv",
"C:/files/project/file.csv", "file:/C:/files/project/file.csv",
"file:///C:/files/project/file.csv", "/file.csv" */
Dataset<Row> logData = spark_session.read().csv(logFile);
Either variant throws this exception:
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:C:/files/project/spark-warehouse
at org.apache.hadoop.fs.Path.initialize(Path.java:206)
at org.apache.hadoop.fs.Path.<init>(Path.java:172)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.makeQualifiedPath(SessionCatalog.scala:114)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:145)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(SessionCatalog.scala:89)
at org.apache.spark.sql.internal.SessionState.catalog$lzycompute(SessionState.scala:95)
at org.apache.spark.sql.internal.SessionState.catalog(SessionState.scala:95)
at org.apache.spark.sql.internal.SessionState$$anon$1.<init>(SessionState.scala:112)
at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:112)
at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:111)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
at org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:373)
at <call in my line of code>
How can I load a csv file into a Dataset<Row> from Java code?
The severity of the issue is "Minor". This workaround resolves my error, thanks!
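The message in the stack trace can be reproduced with java.net.URI alone, which may help explain why none of the logFile spellings made a difference: the failing path is the spark-warehouse directory, not the csv file. The sketch below assumes that Hadoop's Path ends up calling the multi-argument URI constructor with the scheme "file" and a Windows path that does not start with "/". The class name WarehouseUriDemo and the helper build are hypothetical and exist only for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class WarehouseUriDemo {

    /**
     * Builds a hierarchical URI from a scheme and a path.
     * Returns the URI as a string, or the exception message if construction fails.
     */
    static String build(String scheme, String path) {
        try {
            // With a scheme present, java.net.URI requires the path to start with '/'.
            return new URI(scheme, null, path, null, null).toString();
        } catch (URISyntaxException e) {
            return "URISyntaxException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Drive-letter path without a leading slash: rejected as a relative path.
        System.out.println(build("file", "C:/files/project/spark-warehouse"));

        // Same path with a leading slash: accepted as an absolute path.
        System.out.println(build("file", "/C:/files/project/spark-warehouse"));
    }
}
```

The first call yields the same "Relative path in absolute URI" message as the trace, while the second produces a well-formed file URI. A commonly reported workaround for this kind of failure, which may or may not be the one the comment refers to, is to set the Spark configuration key spark.sql.warehouse.dir explicitly to a well-formed file URI (for example file:///C:/files/project/spark-warehouse) when building the SparkSession.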