unnest_tokens and its error ("")

I am working with tidytext. When I call unnest_tokens, R returns the error "Please supply column name". How can I solve this error?

library(tidytext) 
library(tm) 
library(dplyr) 
library(stats) 
library(base) 
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~# 
    #Build a corpus: a collection of statements 
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~# 
f <- Corpus(DirSource("C:/Users/Boon/Desktop/Dissertation/F"))
doc_dir <- "C:/Users/Boon/Desktop/Dis/F/f.csv"
# read the CSV from the path defined above (file_loc was never defined)
doc <- read.csv(doc_dir, header = TRUE)
docs <- Corpus(DataframeSource(doc))
dtm <- DocumentTermMatrix(docs)
text_df <- data_frame(line = 1:115, docs = docs)

#This is the output from the code above,which is fine!: 
# text_df 
# A tibble: 115 x 2 
#line   docs 
#<int> <S3: VCorpus> 
# 1  1 <S3: VCorpus> 
#2  2 <S3: VCorpus> 
#3  3 <S3: VCorpus> 
#4  4 <S3: VCorpus> 
#5  5 <S3: VCorpus> 
#6  6 <S3: VCorpus> 
#7  7 <S3: VCorpus> 
#8  8 <S3: VCorpus> 
#9  9 <S3: VCorpus> 
#10 10 <S3: VCorpus> 
# ... with 105 more rows 

unnest_tokens(word, docs) 

# Error: Please supply column name

Source

2017-07-20 Sapphasak Chatchawan

A minimal, complete, and verifiable example is needed: http://stackoverflow.com/help/mcve

You need to pass the data as the first argument, as in 'unnest_tokens(tbl = text_df, output = words, input = docs)' – Nate
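
Nate's suggestion can be sketched as follows. This is a minimal illustration, assuming the text sits in a plain character column; the data frame `text_df` and its `text` column below are made up for the example and are not the asker's VCorpus-based data.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for the asker's data: one plain character column.
text_df <- tibble(line = 1:2,
                  text = c("Please supply the data first.",
                           "Then name the output and input columns."))

# The data frame goes in the first argument (tbl); output and input are
# given as bare column names.
words_explicit <- unnest_tokens(tbl = text_df, output = word, input = text)

# Equivalent piped form: %>% passes text_df as the first argument.
words_piped <- text_df %>% unnest_tokens(word, text)

words_explicit
```

Calling `unnest_tokens(word, docs)` on its own, as in the question, leaves out the data frame entirely, so the function has no table in which to look up the column.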

Dear Nate, thank you for your help. It seems to be working. unnest_tokens_(tbl, output_col, input_col, token = token, to_lower = to_lower

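If you want to convert your text data to a tidy format, you do not need to convert it to a corpus, a document-term matrix, or anything else first. That is one of the main ideas behind using a tidy data format for text: you don't use those other formats unless you need them for modeling.

Put the raw text in a data frame, then use unnest_tokens() to tidy it. (I am making some assumptions here about what your CSV looks like; posting a reproducible example next time would be more helpful.)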

library(dplyr) 

# tibble() replaces the deprecated data_frame()
docs <- tibble(line = 1:4,
               document = c("This is an excellent document.",
                            "Wow, what a great set of words!",
                            "Once upon a time...",
                            "Happy birthday!"))

docs 
#> # A tibble: 4 x 2
#>    line document
#>   <int> <chr>
#> 1     1 This is an excellent document.
#> 2     2 Wow, what a great set of words!
#> 3     3 Once upon a time...
#> 4     4 Happy birthday!

library(tidytext) 

docs %>%
    unnest_tokens(word, document)
#> # A tibble: 18 x 2
#>     line word
#>    <int> <chr>
#>  1     1 this
#>  2     1 is
#>  3     1 an
#>  4     1 excellent
#>  5     1 document
#>  6     2 wow
#>  7     2 what
#>  8     2 a
#>  9     2 great
#> 10     2 set
#> 11     2 of
#> 12     2 words
#> 13     3 once
#> 14     3 upon
#> 15     3 a
#> 16     3 time
#> 17     4 happy
#> 18     4 birthday
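
If the text starts out in a CSV file, as in the question, the same approach should apply once the file is read into a data frame. The sketch below is only an illustration: the file is a temporary one written on the spot, and the column names `line` and `text` are assumptions, since the layout of the asker's CSV is not known.

```r
library(dplyr)
library(tidytext)

# Write a small stand-in CSV to a temporary file (hypothetical contents).
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(line = 1:2,
                     text = c("Once upon a time...", "Happy birthday!")),
          csv_path, row.names = FALSE)

# stringsAsFactors = FALSE keeps the text column as character, which is
# what unnest_tokens() expects as its input column.
doc <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

# One row per word, with the line number carried along.
tidy_words <- doc %>%
  unnest_tokens(word, text)

tidy_words
```

No Corpus() or DocumentTermMatrix() step is involved; the data frame returned by read.csv() goes straight into unnest_tokens().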

Source

2017-07-21 18:47:49

If you actually do have your data in a document-term matrix (for example, from tm), what you want is ['tidy()'](http://tidytextmining.com/dtm.html), not 'unnest_tokens()'.

Thanks Julia :)
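
A minimal sketch of that route, assuming the tm package is installed; the two example sentences and the object names (`texts`, `corpus`, `dtm`, `tidy_dtm`) are invented for illustration:

```r
library(tm)
library(tidytext)

# Hypothetical documents standing in for real data.
texts <- c("This is an excellent document.",
           "Wow, what a great set of words!")

# Build a corpus and a document-term matrix with tm.
corpus <- VCorpus(VectorSource(texts))
dtm <- DocumentTermMatrix(corpus)

# tidy() turns the matrix into a data frame with one row per
# document-term pair, in columns document, term, and count.
tidy_dtm <- tidy(dtm)

tidy_dtm
```

Here the tokenizing has already been done by tm when the matrix was built, so there is no text column left for unnest_tokens() to split.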
