Another approach could be the following:
library(Matrix)

# Sample data (I have slightly modified your input data to make this approach easier to follow)
mat <- Matrix(data = c(1,0,0,0,0,1,1,0,1,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,1), nrow = 5, ncol = 5,
              dimnames = list(c("P1","P2","P3","P4","P5"), c("P1","P2","P3","P4","P5")),
              sparse = TRUE)
mat
# Create a data frame of similar-product pairs from the nonzero entries of the matrix
mat_summary <- summary(mat)
df <- data.frame(Product_Name = rownames(mat)[mat_summary$i],
                 Similar_Product_Name = colnames(mat)[mat_summary$j])
# Drop the diagonal entries, where a product is matched with itself
df <- df[df$Product_Name != df$Similar_Product_Name, ]
df
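# Clustering to get the final result: each connected component of the graph is one product cluster
library(igraph)
library(data.table)
# graph_from_data_frame() and components() are the current names of graph.data.frame() and clusters()
df.g <- graph_from_data_frame(df)
final_df <- setNames(setDT(as.data.frame(components(df.g)$membership), keep.rownames = TRUE)[], c('Product', 'Product_Cluster'))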
final_df
The output is:
   Product Product_Cluster
1:      P1               1
2:      P4               1
3:      P2               1
4:      P3               2
5:      P5               2
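Because the clusters here are connected components of the similarity graph, similarity is effectively treated as transitive, so one extra link is enough to merge two groups. The following is only a minimal sketch of that effect, assuming igraph is installed; the P3 to P4 link, the `edges_linked` data frame, and the `count_clusters` helper are hypothetical and are not part of the data above.

```r
library(igraph)

# Off-diagonal pairs taken from the sample matrix in the answer
edges <- data.frame(
  Product_Name         = c("P1", "P4", "P2", "P3"),
  Similar_Product_Name = c("P2", "P2", "P4", "P5")
)

# Hypothetical extra link between P3 and P4 (an assumption for illustration only)
edges_linked <- rbind(
  edges,
  data.frame(Product_Name = "P3", Similar_Product_Name = "P4")
)

# Illustrative helper: number of connected components in the undirected similarity graph
count_clusters <- function(e) {
  components(graph_from_data_frame(e, directed = FALSE))$no
}

count_clusters(edges)         # two groups: {P1, P2, P4} and {P3, P5}
count_clusters(edges_linked)  # a single group once P3 and P4 are linked
```

If chained similarity of this kind is not wanted, connected components may be too coarse, and the input pairs would need to be checked before clustering.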
Isn't P3 similar to P4? By that logic they would all end up in the same group. – Prem
Sorry, I copied and pasted and changed the original idea. P3 should not be similar to the others. –