
I have installed Cassandra and Spark on my machine. Using the Spark SQL support, I am trying to join two Cassandra tables, but I get the error: missing EOF.

https://docs.datastax.com/en/datastax_enterprise/4.6/datastax_enterprise/spark/sparkSqlSupportedSyntax.html

Supported syntax of Spark SQL The following syntax defines a SELECT query.

SELECT [DISTINCT] [column names]|[wildcard] FROM [keyspace name.]table name [JOIN clause table name ON join condition] [WHERE condition] [GROUP BY column name] [HAVING conditions] [ORDER BY column names [ASC | DSC]]

For the JOIN I have the following code:

SparkConf conf = new SparkConf().setAppName("My application").setMaster("local"); 
conf.set("spark.cassandra.connection.host", "localhost"); 
JavaSparkContext sc = new JavaSparkContext(conf); 
CassandraConnector connector = CassandraConnector.apply(sc.getConf()); 
Session session = connector.openSession(); 

ResultSet results; 
String sql =""; 


BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(System.in)); 
sql = "SELECT * from siem.report JOIN siem.netstat on siem.report.REPORTUUID = siem.netstat.NETSTATREPORTUUID ALLOW FILTERING;"; 
results = session.execute(sql); 

and I get the following error:

Caused by: com.datastax.driver.core.exceptions.SyntaxError: line 1:25 missing EOF at ',' (SELECT * from siem.report[,] siem...)
    at com.datastax.driver.core.exceptions.SyntaxError.copy(SyntaxError.java:58)
    at com.datastax.driver.core.exceptions.SyntaxError.copy(SyntaxError.java:24)
    at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
    at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
    at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:63)
    at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:39)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.datastax.spark.connector.cql.SessionProxy.invoke(SessionProxy.scala:33)
    at com.sun.proxy.$Proxy59.execute(Unknown Source)
    at com.ge.predix.rmd.siem.boot.PersistenceTest.test_QuerySparkOnReport_GIACOMO_LogDao(PersistenceTest.java:178)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:73)
    at org.springframework.test.context.junit4.statements

I also tried:

SELECT * from siem.report R JOIN siem.netstat N on R.REPORTUUID = N.NETSTATREPORTUUID ALLOW FILTERING 

and I tried:

SELECT * from siem.report JOIN siem.netstat on report.REPORTUUID = netstat.NETSTATREPORTUUID ALLOW FILTERING 

Can anyone help me? Am I actually using Spark SQL here, or CQL?

UPDATE

I tried:

public void test_JOIN_on_Cassandra() {

    SparkConf conf = new SparkConf().setAppName("My application").setMaster("local");
    conf.set("spark.cassandra.connection.host", "localhost");
    JavaSparkContext sc = new JavaSparkContext(conf);

    SQLContext sqlContext = new SQLContext(sc);
    try {
        //QueryExecution test1 = sqlContext.executeSql("SELECT * from siem.report");
        //QueryExecution test2 = sqlContext.executeSql("SELECT * from siem.report JOIN siem.netstat on report.REPORTUUID = netstat.NETSTATREPORTUUID");
        QueryExecution test3 = sqlContext.executeSql("SELECT * from siem.report JOIN siem.netstat on siem.report.REPORTUUID = siem.netstat.NETSTATREPORTUUID");
    } catch (Exception e) {
        e.printStackTrace();
    }

    // SchemaRDD results = sc.sql("SELECT * from siem.report JOIN siem.netstat on siem.report.REPORTUUID = siem.netstat.NETSTATREPORTUUID");
}

but I get the following error:

== Parsed Logical Plan ==
'Project [unresolvedalias(*)]
+- 'Join Inner, Some(('siem.report.REPORTUUID = 'siem.netstat.NETSTATREPORTUUID))
   :- 'UnresolvedRelation `siem`.`report`, None
   +- 'UnresolvedRelation `siem`.`netstat`, None

== Analyzed Logical Plan ==
org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to toAttribute on unresolved object, tree: unresolvedalias()
'Project [unresolvedalias(*)]
+- 'Join Inner, Some(('siem.report.REPORTUUID = 'siem.netstat.NETSTATREPORTUUID))
   :- 'UnresolvedRelation `siem`.`report`, None
   +- 'UnresolvedRelation `siem`.`netstat`, None

== Optimized Logical Plan ==
org.apache.spark.sql.AnalysisException: Table not found: `siem`.`report`;

== Physical Plan ==
org.apache.spark.sql.AnalysisException: Table not found: `siem`.`report`;

Answer


It looks like you are blending a couple of concepts here. The session you are creating opens a direct connection to Cassandra, which means it accepts CQL, not SQL. If you want to run SQL instead, a small change is needed:

SparkConf conf = new SparkConf().setAppName("My application").setMaster("local"); 
conf.set("spark.cassandra.connection.host", "localhost"); 
JavaSparkContext sc = new JavaSparkContext(conf); 

// Use the connector's SQL context, which knows how to resolve Cassandra tables
CassandraSQLContext sqlContext = new CassandraSQLContext(sc.sc());
DataFrame results = sqlContext.sql("SELECT * from siem.report JOIN siem.netstat on siem.report.REPORTUUID = siem.netstat.NETSTATREPORTUUID");

Instead of connecting to Cassandra directly, this invokes Spark SQL through the Spark context. More information: http://docs.datastax.com/en/latest-dse/datastax_enterprise/spark/sparkSqlJava.html
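The "Table not found: siem.report" error from the plain SQLContext above happens because Spark SQL's catalog knows nothing about Cassandra tables until they are registered. As an alternative to CassandraSQLContext, a minimal sketch (assuming a local Cassandra cluster, spark-cassandra-connector 1.4.x with Spark 1.4.x, and that the column names are stored lowercase as `reportuuid`/`netstatreportuuid` — adjust to your schema) registers each table through the connector's data source and then joins on the Spark side:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class CassandraJoinSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("My application").setMaster("local");
        conf.set("spark.cassandra.connection.host", "localhost");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Load each Cassandra table via the connector's data source and
        // register it under a name Spark SQL's catalog can resolve.
        DataFrame report = sqlContext.read()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "siem")
                .option("table", "report")
                .load();
        report.registerTempTable("report");

        DataFrame netstat = sqlContext.read()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "siem")
                .option("table", "netstat")
                .load();
        netstat.registerTempTable("netstat");

        // The join now executes in Spark, not in Cassandra, so no
        // ALLOW FILTERING is involved (that clause is CQL-only anyway).
        DataFrame joined = sqlContext.sql(
                "SELECT * FROM report JOIN netstat ON report.reportuuid = netstat.netstatreportuuid");
        joined.show();
    }
}
```

Because the join runs inside Spark, it also works for arbitrary join keys, which Cassandra itself would reject.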
