2017-11-08 22 views

Poor GridSearchCV results with Keras on Iris

I was trying out GridSearchCV with Keras on the Iris data. The grid search is over batch_size and epochs. However, the resulting accuracy surprised me, and I can't find the cause. Thanks for your help!

The code and its output are attached below.

from keras.models import Sequential 
from keras.layers import Dense 
from keras.utils import np_utils 
from keras.wrappers.scikit_learn import KerasClassifier 
import numpy 
import pandas as pd 
from sklearn.preprocessing import LabelEncoder 
from sklearn.model_selection import GridSearchCV 

# Function to create model, required for KerasClassifier 
def create_model(): 
    # create model 
    model = Sequential() 
    model.add(Dense(8, input_dim=4, activation='relu')) 
    model.add(Dense(3, activation='softmax')) 
    # Compile model 
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) 
    return model 
# fix random seed for reproducibility 
seed = 7 
numpy.random.seed(seed) 

dataframe = pd.read_csv("iris.csv", header=None) 
dataset = dataframe.values 
X = dataset[:,0:4].astype(float) 
Y = dataset[:,4] 

# encode class values as integers 
encoder = LabelEncoder() 
encoder.fit(Y) 
encoded_Y = encoder.transform(Y) 
# convert integers to dummy variables (i.e. one hot encoded) 
dummy_y = np_utils.to_categorical(encoded_Y) 

# create model 
model = KerasClassifier(build_fn=create_model, verbose=0) 
# define the grid search parameters 
batch_size = [5, 10, 20, 40, 60, 80, 100] 
epochs = [10, 50, 100, 200] 
param_grid = dict(batch_size=batch_size, epochs=epochs) 
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1) 
grid_result = grid.fit(X, dummy_y) 
# summarize results 
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_)) 
means = grid_result.cv_results_['mean_test_score'] 
stds = grid_result.cv_results_['std_test_score'] 
params = grid_result.cv_results_['params'] 
for mean, stdev, param in zip(means, stds, params): 
    print("%f (%f) with: %r" % (mean, stdev, param)) 

Using TensorFlow backend. 
Best: 0.666667 using {'batch_size': 100, 'epochs': 10} 
0.000000 (0.000000) with: {'batch_size': 5, 'epochs': 10} 
0.000000 (0.000000) with: {'batch_size': 5, 'epochs': 50} 
0.000000 (0.000000) with: {'batch_size': 5, 'epochs': 100} 
0.000000 (0.000000) with: {'batch_size': 5, 'epochs': 200} 
0.000000 (0.000000) with: {'batch_size': 10, 'epochs': 10} 
0.000000 (0.000000) with: {'batch_size': 10, 'epochs': 50} 
0.000000 (0.000000) with: {'batch_size': 10, 'epochs': 100} 
0.000000 (0.000000) with: {'batch_size': 10, 'epochs': 200} 
0.006667 (0.009428) with: {'batch_size': 20, 'epochs': 10} 
0.000000 (0.000000) with: {'batch_size': 20, 'epochs': 50} 
0.000000 (0.000000) with: {'batch_size': 20, 'epochs': 100} 
0.000000 (0.000000) with: {'batch_size': 20, 'epochs': 200} 
0.333333 (0.471405) with: {'batch_size': 40, 'epochs': 10} 
0.000000 (0.000000) with: {'batch_size': 40, 'epochs': 50} 
0.000000 (0.000000) with: {'batch_size': 40, 'epochs': 100} 
0.000000 (0.000000) with: {'batch_size': 40, 'epochs': 200} 
0.006667 (0.009428) with: {'batch_size': 60, 'epochs': 10} 
0.013333 (0.018856) with: {'batch_size': 60, 'epochs': 50} 
0.000000 (0.000000) with: {'batch_size': 60, 'epochs': 100} 
0.000000 (0.000000) with: {'batch_size': 60, 'epochs': 200} 
0.000000 (0.000000) with: {'batch_size': 80, 'epochs': 10} 
0.000000 (0.000000) with: {'batch_size': 80, 'epochs': 50} 
0.000000 (0.000000) with: {'batch_size': 80, 'epochs': 100} 
0.000000 (0.000000) with: {'batch_size': 80, 'epochs': 200} 
0.666667 (0.471405) with: {'batch_size': 100, 'epochs': 10} 
0.000000 (0.000000) with: {'batch_size': 100, 'epochs': 50} 
0.040000 (0.056569) with: {'batch_size': 100, 'epochs': 100} 
0.000000 (0.000000) with: {'batch_size': 100, 'epochs': 200} 

Answer

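Your data was not shuffled, so try adding the following lines right after X and Y are created (before Y is encoded):

from sklearn.utils import shuffle

X, Y = shuffle(X, Y)

This is the reason behind the strange behavior: in every split of the 3-fold cross-validation, the training set contained only two classes, and the third class appeared only in the test set. See here for details.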
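To make the cause concrete, here is a minimal sketch in plain Python (no Keras or scikit-learn needed). It assumes the Iris file is sorted by class with 50 rows per class, and that the cross-validation cuts the data into contiguous, unshuffled blocks. The helper names `contiguous_folds` and `classes_per_fold` are made up for this illustration and are not part of any library.

```python
import random


def contiguous_folds(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs where each test fold is one contiguous block."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def classes_per_fold(labels, n_splits=3):
    """Return, for every fold, the sorted classes seen in the train and test parts."""
    summary = []
    for train, test in contiguous_folds(len(labels), n_splits):
        train_classes = sorted({labels[i] for i in train})
        test_classes = sorted({labels[i] for i in test})
        summary.append((train_classes, test_classes))
    return summary


# Assumption: labels sorted by class, 50 rows each, like the usual iris.csv.
sorted_labels = [0] * 50 + [1] * 50 + [2] * 50
unshuffled = classes_per_fold(sorted_labels)

# Same labels after shuffling (fixed seed so the run is repeatable).
shuffled_labels = sorted_labels[:]
random.Random(7).shuffle(shuffled_labels)
shuffled = classes_per_fold(shuffled_labels)

for name, summary in (("unshuffled", unshuffled), ("shuffled", shuffled)):
    for fold, (train_classes, test_classes) in enumerate(summary):
        print(name, "fold", fold, "train:", train_classes, "test:", test_classes)
```

Without shuffling, every fold is tested on a class the model never saw during training, which is why the scores sit at or near zero. As an alternative to shuffling the arrays yourself, passing a splitter such as sklearn.model_selection.KFold(n_splits=3, shuffle=True) as the cv argument of GridSearchCV should have a similar effect, though I have not run that against the script above.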

Thank you! It works like a charm, using from sklearn.utils import shuffle. – maroontide

I'd appreciate upvotes :) –
