Matching levels of variables from two datasets

I am working on a predictive model and need to match the levels of each variable between the training data and the test data, using code like the following:

levels(test$MSSubClass) <- levels(train$MSSubClass)

There are 87 variables in total. Instead of doing this one by one, I am looking for an approach that matches them all at once. Currently, my code looks like this:

levels(test$MSSubClass) <- levels(train$MSSubClass) 
levels(test$MSZoning) <- levels(train$MSZoning) 
levels(test$LotFrontage) <- levels(train$LotFrontage) 
levels(test$LotArea) <- levels(train$LotArea) 
levels(test$Street) <- levels(train$Street) 
....

Source

2017-09-01 Ran Tao

FACT = which(sapply(train, class) == "factor"); for (i in FACT) { levels(test[, i]) <- levels(train[, i]) } – G5W

If the levels in test are a subset of the levels in train:

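# Get the names of the columns that are factors (is.factor also covers ordered factors)
factor_names <- names(train)[sapply(train, is.factor)]

# Re-encode each factor in test against the levels of train.
# factor(x, levels = ...) matches by label; `levels(x) <- ...` would rename by position.
for (name in factor_names) {test[[name]] <- factor(test[[name]], levels = levels(train[[name]]))}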
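
One thing to watch when aligning levels: assigning with `levels(x) <- ...` renames the existing levels by position, while `factor(x, levels = ...)` matches values by label. The sketch below illustrates the difference on made-up data and wraps the label-based approach in a small helper. The data frames, their values, and the `align_levels` function are all hypothetical and only for illustration; the column names are borrowed from the question.

```r
# Toy data (made up): test contains only a subset of train's levels.
train <- data.frame(
  Street   = factor(c("Grvl", "Pave", "Pave")),
  MSZoning = factor(c("RH", "RL", "RM")),
  LotArea  = c(8450, 9600, 11250)
)
test <- data.frame(
  Street   = factor(c("Pave", "Pave")),
  MSZoning = factor(c("RM", "RH")),
  LotArea  = c(11622, 14267)
)

# Pitfall: `levels<-` renames by position, so "Pave" (level 1 in test)
# is relabelled as "Grvl" (level 1 in train).
relabelled <- test$Street
levels(relabelled) <- levels(train$Street)
as.character(relabelled)  # "Grvl" "Grvl"

# Hypothetical helper: re-encode every factor column of `new` against the
# levels of the same column in `ref`, matching by label.
align_levels <- function(new, ref) {
  factor_names <- names(ref)[sapply(ref, is.factor)]
  for (name in intersect(factor_names, names(new))) {
    # Values absent from the reference levels would silently become NA, so warn.
    unseen <- setdiff(unique(as.character(new[[name]])),
                      c(levels(ref[[name]]), NA))
    if (length(unseen) > 0) {
      warning(sprintf("%s: values not in reference levels become NA: %s",
                      name, paste(unseen, collapse = ", ")))
    }
    new[[name]] <- factor(new[[name]], levels = levels(ref[[name]]))
  }
  new
}

aligned <- align_levels(test, train)
levels(aligned$Street)        # "Grvl" "Pave"
as.character(aligned$Street)  # "Pave" "Pave"
```

Non-factor columns such as `LotArea` are left untouched, and the warning is there because a level that appears only in test cannot be represented with train's levels.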

Source

2017-09-01 21:34:40
