2017-12-08 4 views

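I have looked at similar threads here and on GitHub, and none of the issues suggested by Max and others seems to apply to my case.

Those threads report that the formula interface fails while the non-formula interface works fine. My problem is the opposite. The following train() call with the formula interface works perfectly: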

glmTune <- train(class ~ ., 
       data = trainData, 
       method = "glmnet", 
       trControl = train.control, 
       tuneGrid = tune.grid) 

The following one gives an NA error:

predictors <- trainData[, !(names(trainData) %in% "class")] 
response <- trainData$class 
glmTune <- train(x = predictors, 
       y = response, 
       method = "glmnet", 
       trControl = train.control, 
       tuneGrid = tune.grid) 

This happens with both glmnet and xgboost, and regardless of whether y is a factor or numeric. Note that x contains many factor variables. Thanks for your help.

Something is wrong; all the Accuracy metric values are missing: 
    Accuracy  Kappa  
Min. : NA Min. : NA 
1st Qu.: NA 1st Qu.: NA 
Median : NA Median : NA 
Mean :NaN Mean :NaN 
3rd Qu.: NA 3rd Qu.: NA 
Max. : NA Max. : NA 
NA's :243 NA's :243 
Error: Stopping 
In addition: Warning message: 
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, : 
    There were missing values in resampled performance measures. 

The error above is for a factor y. With a numeric y it is only slightly different (different performance metrics):

Something is wrong; all the RMSE metric values are missing: 
     RMSE  Rsquared 
Min. : NA Min. : NA 
1st Qu.: NA 1st Qu.: NA 
Median : NA Median : NA 
Mean :NaN Mean :NaN 
3rd Qu.: NA 3rd Qu.: NA 
Max. : NA Max. : NA 
NA's :100 NA's :100 
Error: Stopping 
In addition: Warning message: 
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, : 
    There were missing values in resampled performance measures. 

Here is the code:

library(caret) 
library(dplyr) 
library(glmnet) 

# see dput(droplevels(head(df, 20))) output of data below: 

# 70%/30% split 
set.seed(42) 
inTrain <- createDataPartition(df$lnprice, p = 0.7, list = FALSE)
trainData <- df[inTrain, ] 
testData <- df[-inTrain, ] 

# train model 
train.control <- trainControl(method = "repeatedcv",
                              number = 10,
                              repeats = 5,
                              allowParallel = FALSE)
tune.grid <- expand.grid(lambda = seq(0.0001, 0.1, length = 20),
                         alpha = c(0, 0.5, 1))
X <- trainData[, !(names(trainData) %in% "lnprice")] 
Y <- trainData$lnprice 
fit <- train(
# x = X, y = Y,      # non-formula 
    lnprice ~ ., data = trainData,  # formula 
    method = "glmnet", 
    preProcess = c("zv", "center", "scale"), 
    tuneGrid = tune.grid, 
    trControl = train.control) 

# plot model 
print(plot(fit)) 

> dput(droplevels(head(df,20))) 
structure(list(fuel.type = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "gas", class = "factor"), 
    aspiration = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("std", 
    "turbo"), class = "factor"), num.of.doors = structure(c(2L, 
    2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 
    2L, 1L, 2L, 2L), .Label = c("four", "two"), class = "factor"), 
    body.style = structure(c(1L, 1L, 2L, 3L, 3L, 3L, 3L, 4L, 
    3L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L), .Label = c("convertible", 
    "hatchback", "sedan", "wagon"), class = "factor"), drive.wheels = structure(c(2L, 
    2L, 2L, 1L, 3L, 1L, 1L, 1L, 1L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 1L, 1L), .Label = c("fwd", "rwd", "X4wd"), class = "factor"), 
    engine.location = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "front", class = "factor"), 
    wheel.base = c(88.6, 88.6, 94.5, 99.8, 99.4, 99.8, 105.8, 
    105.8, 105.8, 99.5, 101.2, 101.2, 101.2, 101.2, 103.5, 103.5, 
    103.5, 110, 88.4, 94.5), length = c(168.8, 168.8, 171.2, 
    176.6, 176.6, 177.3, 192.7, 192.7, 192.7, 178.2, 176.8, 176.8, 
    176.8, 176.8, 189, 189, 193.8, 197, 141.1, 155.9), width = c(64.1, 
    64.1, 65.5, 66.2, 66.4, 66.3, 71.4, 71.4, 71.4, 67.9, 64.8, 
    64.8, 64.8, 64.8, 66.9, 66.9, 67.9, 70.9, 60.3, 63.6), height = c(48.8, 
    48.8, 52.4, 54.3, 54.3, 53.1, 55.7, 55.7, 55.9, 52, 54.3, 
    54.3, 54.3, 54.3, 55.7, 55.7, 53.7, 56.3, 53.2, 52), curb.weight = c(2548L, 
    2548L, 2823L, 2337L, 2824L, 2507L, 2844L, 2954L, 3086L, 3053L, 
    2395L, 2395L, 2710L, 2765L, 3055L, 3230L, 3380L, 3505L, 1488L, 
    1874L), engine.type = structure(c(1L, 1L, 4L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 3L), .Label = c("dohc", 
    "l", "ohc", "ohcv"), class = "factor"), num.of.cylinders = structure(c(2L, 
    2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 
    1L, 1L, 2L, 2L), .Label = c("five.six", "four.or.less"), class = "factor"), 
    engine.size = c(130L, 130L, 152L, 109L, 136L, 136L, 136L, 
    136L, 131L, 131L, 108L, 108L, 164L, 164L, 164L, 209L, 209L, 
    209L, 61L, 90L), fuel.system = structure(c(1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
    2L), .Label = c("mpfi", "X2bbl"), class = "factor"), bore = c(3.47, 
    3.47, 2.68, 3.19, 3.19, 3.19, 3.19, 3.19, 3.13, 3.13, 3.5, 
    3.5, 3.31, 3.31, 3.31, 3.62, 3.62, 3.62, 2.91, 3.03), stroke = c(2.68, 
    2.68, 3.47, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 2.8, 2.8, 
    3.19, 3.19, 3.19, 3.39, 3.39, 3.39, 3.03, 3.11), compression.ratio = c(9, 
    9, 9, 10, 8, 8.5, 8.5, 8.5, 8.3, 7, 8.8, 8.8, 9, 9, 9, 8, 
    8, 8, 9.5, 9.6), horsepower = c(111, 111, 154, 102, 115, 
    110, 110, 110, 140, 160, 101, 101, 121, 121, 121, 182, 182, 
    182, 48, 70), peak.rpm = c(5000L, 5000L, 5000L, 5500L, 5500L, 
    5500L, 5500L, 5500L, 5500L, 5500L, 5800L, 5800L, 4250L, 4250L, 
    4250L, 5400L, 5400L, 5400L, 5100L, 5400L), city.mpg = c(21L, 
    21L, 19L, 24L, 18L, 19L, 19L, 19L, 17L, 16L, 23L, 23L, 21L, 
    21L, 20L, 16L, 16L, 15L, 47L, 38L), highway.mpg = c(27L, 
    27L, 26L, 30L, 22L, 25L, 25L, 25L, 20L, 22L, 29L, 29L, 28L, 
    28L, 25L, 22L, 22L, 20L, 53L, 43L), make = structure(c(1L, 
    1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 3L, 4L, 4L), .Label = c("alfa.romero", "audi", "bmw", 
    "chevrolet"), class = "factor"), lnprice = c(9.5101, 9.7111, 
    9.7111, 9.5432, 9.7671, 9.6323, 9.7819, 9.848, 10.0806, 9.69176, 
    9.7069, 9.7365, 9.9508, 9.9573, 10.1091, 10.334, 10.629, 
    10.5154, 8.5469, 8.7475)), .Names = c("fuel.type", "aspiration", 
"num.of.doors", "body.style", "drive.wheels", "engine.location", 
"wheel.base", "length", "width", "height", "curb.weight", "engine.type", 
"num.of.cylinders", "engine.size", "fuel.system", "bore", "stroke", 
"compression.ratio", "horsepower", "peak.rpm", "city.mpg", "highway.mpg", 
"make", "lnprice"), row.names = c(NA, 20L), class = "data.frame") 

Please provide a small reproducible example to test with. – topepo


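I have pasted the code at the bottom of the original post. It is for a regression model using auto pricing data, where the response variable is log(price), stored in the 'lnprice' column. The full data set has 205 rows and 24 columns. I was not sure how to provide the data, so, following the instructions, I pasted the first 20 rows as the R console output of 'dput(droplevels(head(df, 20)))'. I hope this helps, and thank you for looking into this. – Manojit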

Answer


Strange. It looks like the train.default method has no default na.action handler? From ?caret::train:

## Default S3 method:
train(x, y, method = "rf", preProcess = NULL, ...,
    weights = NULL, metric = ifelse(is.factor(y), "Accuracy", "RMSE"),
    maximize = ifelse(metric %in% c("RMSE", "logLoss", "MAE"), FALSE, TRUE),
    trControl = trainControl(), tuneGrid = NULL,
    tuneLength = ifelse(trControl$method == "none", 1, 3))

whereas the train.formula method does have one:

## S3 method for class 'formula'
train(form, data, ..., weights, subset, na.action = na.fail, contrasts = NULL)
             ^^^^^^^^^^^^^^^^^^^

If you add na.action = na.fail to your train.default call (the x, y interface), do you get the same behavior as with the train.formula call?
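
Since na.action only matters when the data actually contain missing values, it may be worth checking for NAs before changing that argument. The following is a minimal base R sketch of what na.fail, na.omit, and na.pass do. The data frame `toy` and its columns are hypothetical and only stand in for the real trainData.

```r
# Hypothetical toy data with one missing predictor value
# (illustrative only; not the asker's data).
toy <- data.frame(
  y  = c(1.2, 2.3, 3.1, 4.8, 5.0),
  x1 = c(10, NA, 30, 40, 50),
  x2 = factor(c("a", "b", "a", "b", "a"))
)

# Count missing values per column.
na_counts <- colSums(is.na(toy))
print(na_counts)

# na.fail() stops with an error as soon as any NA is present.
fail_result <- tryCatch(
  na.fail(toy),
  error = function(e) "na.fail raised an error"
)
print(fail_result)

# na.omit() drops incomplete rows; na.pass() returns the data unchanged.
complete_rows <- na.omit(toy)
passed <- na.pass(toy)
print(nrow(complete_rows))
print(nrow(passed))
```

If the equivalent of `colSums(is.na(trainData))` is zero for every column, the na.action setting is unlikely to explain the difference between the two interfaces.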


Thanks for looking into this. I just tried your suggestion and got exactly the same NA error. – Manojit


If you want to ignore the NAs, you can use 'na.action = na.pass'. That does not really explain why the two methods handle them differently, though. – hrabel


Sorry for the delay. I rechecked with 'na.action = na.pass' and got the same error. I wonder whether the error messages I added to my original post hold any clue. Thanks again. – Manojit
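
One difference between the two interfaces may be relevant here, given that x contains many factor variables. As far as the caret documentation describes it, the formula method expands factors into dummy variables, while the x, y method passes the predictors through as they are, and glmnet expects a numeric matrix. The sketch below is an assumption about the cause, not a confirmed diagnosis. It uses only base R to build an all-numeric predictor matrix from a hypothetical data frame `toy` that stands in for trainData.

```r
# Hypothetical toy data mixing factor and numeric predictors
# (illustrative only; not the asker's data).
set.seed(42)
toy <- data.frame(
  lnprice    = rnorm(40, mean = 9.5, sd = 0.5),
  aspiration = factor(rep(c("std", "turbo"), times = 20)),
  body.style = factor(rep(c("hatchback", "sedan", "wagon", "sedan"), times = 10)),
  horsepower = round(runif(40, min = 50, max = 200))
)

# With the x, y interface the predictors are passed as-is, factors included.
X_raw <- toy[, !(names(toy) %in% "lnprice")]
print(sapply(X_raw, class))

# model.matrix() does the dummy-variable expansion; drop the intercept column.
X_num <- model.matrix(lnprice ~ ., data = toy)[, -1]
Y <- toy$lnprice

# X_num is an all-numeric matrix with one column per non-reference factor level.
print(colnames(X_num))
print(is.numeric(X_num))
```

A matrix built this way could then be passed as `train(x = X_num, y = Y, ...)` to see whether the NA error goes away. Running `warnings()` right after the failed call may also show the underlying model-fitting messages.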