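Removing stop words with dplyr

Reading http://tidytextmining.com/tidytext.html, it states:

"Often in text analysis, we will want to remove stop words. Stop words are words that are not useful for an analysis, typically extremely common words such as 'the', 'of', 'to', and so forth in English. We can remove stop words (kept in the tidytext dataset stop_words) with an anti_join()."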

data(stop_words)

tidy_books <- tidy_books %>% anti_join(stop_words)

I am trying to modify this to remove stop words from a string:

data(stop_words) 
str_v <- paste(c("this is a test")) 
str_v <- str_v %>% 
    anti_join(stop_words)

but this returns an error:

Error in UseMethod("anti_join") : 
    no applicable method for 'anti_join' applied to an object of class "character"

Does str_v need to be converted to a class that has an anti_join method?

Source

2017-11-28 blue-sky

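str_v is a character vector, and anti_join has no method for that class. It needs to be converted to a data.frame or tibble with as.tibble (as_tibble in current versions of tibble). unnest_tokens then splits the 'value' column into individual words and names the resulting column 'word', so the column shared with stop_words matches and anti_join joins by 'word'.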

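library(tidytext)
library(tibble)
library(dplyr)

# as_tibble() replaces the deprecated as.tibble(); it puts the string in a column named 'value'
str_v %>%
    as_tibble() %>%
    unnest_tokens(word, value) %>%       # one lowercase word per row, in column 'word'
    anti_join(stop_words, by = "word")   # drop rows whose word appears in stop_words
# A tibble: 1 x 1
#   word
#   <chr>
# 1 test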
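
If the goal is only to get a cleaned string back, the same filtering can be sketched in base R without a join. This is a different technique from the anti_join approach above, shown only as a minimal illustration: `my_stop_words` and `remove_stop_words()` are hypothetical names introduced here, the stop-word list is a small hand-written one rather than tidytext's stop_words, and the tokenizer is a plain whitespace split rather than unnest_tokens.

```r
# Hypothetical, hand-picked stop words; tidytext's stop_words table is much larger.
my_stop_words <- c("this", "is", "a", "the", "of", "to")

# Hypothetical helper: drop stop words from each element of a character vector.
remove_stop_words <- function(x, stop_words = my_stop_words) {
  vapply(x, function(s) {
    words <- strsplit(tolower(s), "[[:space:]]+")[[1]]     # lowercase, split on whitespace
    words <- words[nzchar(words)]                          # drop empty tokens
    paste(words[!words %in% stop_words], collapse = " ")   # keep only non-stop words
  }, character(1), USE.NAMES = FALSE)
}

remove_stop_words("this is a test")
```

With tidytext loaded, passing `stop_words$word` as the second argument should use the package's list instead of the toy one. Punctuation is not stripped here, which unnest_tokens does handle, so the tibble route is likely the better choice for real text.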

Source

2017-11-28 15:48:43 akrun
