I have a list of bigram strings produced by ngram, and I want to add them to the original data table as columns, so that each n-gram ends up in its own column in R.
> prep_test
prep_test
1: Women Athletic,Athletic Apparel,Apparel Pants,Pants Tights,Tights Leggings
2: Beauty Makeup,Makeup Face
3: Beauty Makeup,Makeup Face
4: Electronics Cell,Cell Phones,Phones Accessories,Accessories Cases,Cases Covers,Covers Skins
5: Women Shoes,Shoes Boots
6: Men Men,Men s,s Accessories,Accessories Belts
7: Electronics Cell,Cell Phones,Phones Accessories,Accessories Cell,Cell Phones,Phones Smartphones
8: Women Tops,Tops Blouses,Blouses Other
9: Women Athletic,Athletic Apparel,Apparel Pants,Pants Tights,Tights Leggings
10: Home Home,Home DÃ,DÃ cor,cor Home,Home Fragrance
> str(prep_test)
Classes ‘data.table’ and 'data.frame': 10 obs. of 1 variable:
$ prep_test:List of 10
..$ : chr "Women Athletic" "Athletic Apparel" "Apparel Pants" "Pants Tights" ...
..$ : chr "Beauty Makeup" "Makeup Face"
..$ : chr "Beauty Makeup" "Makeup Face"
..$ : chr "Electronics Cell" "Cell Phones" "Phones Accessories" "Accessories Cases" ...
..$ : chr "Women Shoes" "Shoes Boots"
..$ : chr "Men Men" "Men s" "s Accessories" "Accessories Belts"
..$ : chr "Electronics Cell" "Cell Phones" "Phones Accessories" "Accessories Cell" ...
..$ : chr "Women Tops" "Tops Blouses" "Blouses Other"
..$ : chr "Women Athletic" "Athletic Apparel" "Apparel Pants" "Pants Tights" ...
..$ : chr "Home Home" "Home DÃ" "DÃ cor" "cor Home" ...
- attr(*, ".internal.selfref")=<externalptr>
Here is my current code:
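library(ngram)       # provides ngram_asweka()
library(data.table)

# Replace runs of punctuation/whitespace with a single space, then build bigrams.
bigram_fun <- function(y) {
  y <- gsub("[[:punct:][:blank:]]+", " ", y)
  y <- ngram_asweka(y, min = 2, max = 2)
  #y <- str_split_fixed(y, ",", n=Inf)
  #y <- unlist(y)
  return(y)
}

prep_test <- all[1:10, 9]                      # first 10 rows, 9th column of my data
prep_test <- apply(prep_test, 1, bigram_fun)   # one character vector of bigrams per row
prep_test <- data.table(prep_test)             # becomes a single list column
prep_test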
dput
> dput(prep_test)
list(c("Women Athletic", "Athletic Apparel", "Apparel Pants",
"Pants Tights", "Tights Leggings"), c("Beauty Makeup", "Makeup Face"
), c("Beauty Makeup", "Makeup Face"), c("Electronics Cell", "Cell Phones",
"Phones Accessories", "Accessories Cases", "Cases Covers", "Covers Skins"
), c("Women Shoes", "Shoes Boots"), c("Men Men", "Men s", "s Accessories",
"Accessories Belts"), c("Electronics Cell", "Cell Phones", "Phones Accessories",
"Accessories Cell", "Cell Phones", "Phones Smartphones"), c("Women Tops",
"Tops Blouses", "Blouses Other"), c("Women Athletic", "Athletic Apparel",
"Apparel Pants", "Pants Tights", "Tights Leggings"), c("Home Home",
"Home DÃ", "DÃ cor", "cor Home", "Home Fragrance"))
Desired result, with one column per n-gram:
Bigram 1            Bigram 2            Bigram 3              Bigram 4            ...
"Women Athletic"    "Athletic Apparel"  "Apparel Pants"       "Pants Tights"      ...
"Beauty Makeup"     "Makeup Face"       NA                    NA                  ...
"Beauty Makeup"     "Makeup Face"       NA                    NA                  ...
"Electronics Cell"  "Cell Phones"       "Phones Accessories"  "Accessories Cases"
"Women Shoes"       "Shoes Boots"       NA                    NA
Any answers would be appreciated.
Any of your code – Chris
You uploaded the `dput` of your data so that the question is reproducible, and in your question `prep_test` is a data.table object. However, your `dput` contains a list, not a data.table. Is something missing? – jazzurro
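One possible way to get the layout shown in the desired result is sketched below. It is not taken from the question: it assumes the bigrams are available as a plain list of character vectors (as in the `dput` output, of which only the first five elements are copied here), and the names `bigrams`, `padded`, and `wide` are made up for illustration. Each vector is padded with NA to the length of the longest one, and the padded vectors are then bound together as rows.

```r
library(data.table)

# First five elements of the list shown in the dput() output above.
bigrams <- list(
  c("Women Athletic", "Athletic Apparel", "Apparel Pants",
    "Pants Tights", "Tights Leggings"),
  c("Beauty Makeup", "Makeup Face"),
  c("Beauty Makeup", "Makeup Face"),
  c("Electronics Cell", "Cell Phones", "Phones Accessories",
    "Accessories Cases", "Cases Covers", "Covers Skins"),
  c("Women Shoes", "Shoes Boots")
)

# Pad every vector to the longest length; indexing a vector past its end
# yields NA, so shorter rows are filled with NA on the right.
max_len <- max(lengths(bigrams))
padded  <- lapply(bigrams, function(x) x[seq_len(max_len)])

# Bind the padded vectors as rows of a character matrix, convert it to a
# data.table, and give the columns descriptive names.
wide <- as.data.table(do.call(rbind, padded))
setnames(wide, paste0("Bigram", seq_len(max_len)))

print(wide)
```

If the list is stored as a list column of a data.table, as in the `str()` output, the same steps should apply to that column (for example `prep_test$prep_test`). The resulting `wide` table has one row per input row, so it could then be attached to the matching rows of the original table with `cbind()`, assuming the row order has not changed.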