Spark 1.6.1, Python 3.5.1: building a naive Bayes classifier

My question is based on this question.

Could you comment in more detail on, or explain, the code starting at this line?

    tf = HashingTF().transform(training_raw.map(lambda doc: doc["text"], preservesPartitioning=True))

How can I print the confusion matrix?

What does the error below mean, and how can I fix it? The model is still built and I do get predictions.

    >>> # Train and check
    ... model = NaiveBayes.train(training)
    [Stage 2:=============================> (2 + 2)/4]16/04/05 18:18:28 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
    16/04/05 18:18:28 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS

How can I print the prediction for a new observation? I tried the following in Spark and got this error:

    >>> model.predict("love")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "c:\spark-1.6.1-bin-hadoop2.6\spark-1.6.1-bin-hadoop2.6\python\pyspark\mllib\classification.py", line 594, in predict
        x = _convert_to_vector(x)
      File "c:\spark-1.6.1-bin-hadoop2.6\spark-1.6.1-bin-hadoop2.6\python\pyspark\mllib\linalg\__init__.py", line 77, in _convert_to_vector
        raise TypeError("Cannot convert type %s into Vector" % type(l))
    TypeError: Cannot convert type <class 'str'> into Vector

2016-04-06 user2543622

Can you add a sample of 'training_raw'?

The data is from http://stackoverflow.com/questions/32231049/how-to-use-spark-naive-bayes-classifier-for-text-classification-with-idf – user2543622

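1. HashingTF is similar to scikit-learn's HashingVectorizer: it maps each term to a vector index by hashing it and counts occurrences. training_raw is an RDD of text. For details on the vectorizers available in PySpark, see Vectorizers. For a complete example, see this post.

2. BLAS is the Basic Linear Algebra Subprograms library. The two messages are WARN-level log lines saying that no native BLAS implementation could be loaded; training continues without it, which is why the model is still built. For a potential solution, see this page on GitHub.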

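3. You are calling model.predict on a string ("love"). You first need to convert the string to a vector. A simple example (see source) takes a dense-vector string and outputs a labeled dense vector, but you are probably looking for a sparse vector, so try Vectors.sparse.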
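Points 1 and 3 fit together: a new string has to go through the same term-to-vector step as the training data before it reaches predict. The following is a minimal pure-Python sketch of that idea, not Spark's implementation. HashingTFSketch and predict_text are hypothetical names, the CRC32 hash and the whitespace tokenization are assumptions chosen so the example is repeatable, and the sparse vector is represented as a plain dict.

```python
import zlib
from collections import Counter


class HashingTFSketch:
    """Illustrative stand-in for a hashing term-frequency vectorizer.

    Assumption: CRC32 is used only to keep results repeatable;
    Spark's HashingTF uses its own hash function.
    """

    def __init__(self, num_features=1 << 20):
        self.num_features = num_features

    def index_of(self, term):
        # Map a term to a fixed bucket; different terms may collide.
        return zlib.crc32(term.encode("utf-8")) % self.num_features

    def transform(self, terms):
        # Sparse representation: {bucket index: term count}.
        return dict(Counter(self.index_of(term) for term in terms))


def predict_text(model, vectorizer, text):
    """Vectorize a raw string with the training-time vectorizer, then predict."""
    if not isinstance(text, str):
        raise TypeError("expected a raw string, got %s" % type(text))
    terms = text.split()  # assumed tokenization: split on whitespace
    features = vectorizer.transform(terms)
    return model.predict(features)
```

The helper is duck-typed, so with Spark MLlib it could be called as predict_text(model, htf, "love"), assuming htf is the same HashingTF instance that produced the training vectors; HashingTF.transform also accepts a single list of terms and returns a sparse vector. If the training pipeline applied IDF after HashingTF, the fitted IDF model would have to be applied to the new vector as well before predicting.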


2016-04-06 01:58:02 goCards

On 2, I understand what BLAS stands for now. But would it be possible to give a hint on how to get rid of the error? Also, please tell me how to print the confusion matrix... thanks – user2543622
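On the confusion-matrix question, here is a minimal sketch, assuming the (prediction, label) pairs have already been collected to the driver as a Python list (for example with collect() on an RDD of such pairs). The function names are hypothetical. MLlib also ships pyspark.mllib.evaluation.MulticlassMetrics, which takes an RDD of (prediction, label) pairs and exposes confusionMatrix(); the pure-Python version below only illustrates what that matrix contains.

```python
from collections import Counter


def confusion_matrix(prediction_and_labels):
    """Count (prediction, label) pairs into a square matrix.

    Rows are actual labels, columns are predicted labels,
    both ordered by ascending label value.
    """
    pairs = [(float(p), float(l)) for p, l in prediction_and_labels]
    counts = Counter(pairs)
    classes = sorted({value for pair in pairs for value in pair})
    matrix = [[counts[(pred, actual)] for pred in classes]
              for actual in classes]
    return classes, matrix


def print_confusion_matrix(classes, matrix):
    # Header row lists predicted labels; each following row is one actual label.
    print("actual\\pred" + "".join("%8s" % c for c in classes))
    for label, row in zip(classes, matrix):
        print("%-11s" % label + "".join("%8d" % n for n in row))
```

Usage would look like classes, matrix = confusion_matrix(pairs) followed by print_confusion_matrix(classes, matrix), where pairs is the collected list.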
