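Error while creating a TermDocumentMatrix in the tm package

I'm new to the tm package and have hit a roadblock when trying to apply the TermDocumentMatrix function. This is the preprocessing code I run first: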
myCorpus <- Corpus(VectorSource(posts$message))
myCorpus <- tm_map(myCorpus, content_transformer(tolower))
myCorpus <- tm_map(myCorpus, removePunctuation)
myCorpus <- tm_map(myCorpus, removeNumbers)
removeURL <- function(x) gsub("http[[:alnum:]]*", "", x)
myCorpus <- tm_map(myCorpus, removeURL)
myStopwords <- c(stopwords("english"))
myCorpus <- tm_map(myCorpus, removeWords, myStopwords)
myCorpusCopy <- myCorpus
myCorpus <- tm_map(myCorpus, stemDocument)
On inspection, the list of documents looks the way it should:
> for(i in 1:5) {
+ cat(paste("[[", i, "]] ", sep =""))
+ writeLines(myCorpus[[i]])
+ }
[[1]] syntel recruitment drive week freshers newregistrationlink passout graduates
qualification graduatebebtechmcamemtech
syntel registration link
limited referrals available
comment emailids reference future job upd
[[2]] dont miss opportunity get placed one best mnc companies world ebay freshers week january
qualification graduate can apply
ebay registration link
comment emailids fast beacuse referrals left
[[3]] recent passouts eligible apply wipro go updated link lastday reference drive jan apply link fresher referral
apply link
go link apply asap
[[4]] robertbosch recruitment drive week freshers newregistrationlink passout graduates
qualification graduatebebtechmcamemtech
robertbosch registration link
limited referrals available
comment emailids reference future job upd
[[5]] mega job openings year
mphasis recruitment freshers january
qualification btech bsc bca graduates mca mba mtech post graduates
mphasis registration link
comment emailids comment box reference future job updates emailbox
However, after creating the copy of the corpus for stem completion, I run into problems.
Are there any suggestions for a workaround?
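
A possible direction, offered as a sketch and not a confirmed diagnosis: in tm 0.6 and later, passing a plain function such as removeURL straight to tm_map can turn the documents into bare character vectors, which TermDocumentMatrix may then reject. Wrapping the function in content_transformer keeps them as text documents. The sample messages below are hypothetical stand-ins for posts$message, VCorpus is used in place of Corpus, the URL pattern is widened so it runs before punctuation is stripped, and stemming is left out to avoid the SnowballC dependency; all of these are assumptions of this sketch.

```r
library(tm)

# Hypothetical stand-in for posts$message (not the original data)
messages <- c(
  "Syntel recruitment drive http://example.com/register 2015!",
  "Apply via the eBay registration link http://example.com/apply"
)

# VCorpus is an assumption here; the original post uses Corpus()
myCorpus <- VCorpus(VectorSource(messages))
myCorpus <- tm_map(myCorpus, content_transformer(tolower))

# Strip URLs before punctuation is removed, so the whole URL is still intact.
# The pattern is a variation on the original "http[[:alnum:]]*".
removeURL <- function(x) gsub("http[^[:space:]]*", "", x)

# Wrap the custom function so each document stays a text document
myCorpus <- tm_map(myCorpus, content_transformer(removeURL))

myCorpus <- tm_map(myCorpus, removePunctuation)
myCorpus <- tm_map(myCorpus, removeNumbers)
myCorpus <- tm_map(myCorpus, removeWords, stopwords("english"))

tdm <- TermDocumentMatrix(myCorpus)
```

If this runs cleanly, the same content_transformer wrapping can be applied in the original pipeline before the stemDocument step; whether it resolves the error depends on the installed tm version and on the exact error message, neither of which is given above.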