scikit-learn: how do I predict the target label in this instance?

Let's say I have a dataset:

import numpy as np
import pandas as pd

data = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
target = "A"

As a toy example, it looks like this:

    A   B   C   D
0  75  38  81  58
1  36  92  80  79
2  22  40  19   3
...

This is obviously not enough data to give good accuracy, but nevertheless, let's say I feed data and target to the random forest algorithm provided by scikit-learn:

import sklearn.ensemble as ske
from sklearn import model_selection

def random_forest(target, data):

    # Drop the target label, which we save separately.
    X = data.drop([target], axis=1).values
    y = data[target].values

    # Run Cross Validation on Random Forest Classifier.
    clf_tree = ske.RandomForestClassifier(n_estimators=50)
    unique_permutations_cross_val(X, y, clf_tree)

where unique_permutations_cross_val is simply:

def unique_permutations_cross_val(X, y, model):

    # Build 10 random shuffle splits, each holding out 20% of the data for testing.
    shuffle_validator = model_selection.ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)

    # Calculate the score of the model on each split.
    scores = model_selection.cross_val_score(model, X, y, cv=shuffle_validator)

    # Print out the mean score, as well as its standard deviation.
    print("Accuracy: %0.4f (+/- %0.2f)" % (scores.mean(), scores.std()))

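(This is just a cross-validation function I made, which also prints out the accuracy of the model.) Anyway, my main question is: how can I predict the target label using this model I have created? For example, suppose I feed the model [28, 12, 33]. I would like the model to predict the target, which in this case is "A".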

Source

2017-11-21 Bolboa

This model is not fitted yet. You have run cross-validation, which tells you how well the model trains on the data, but it does not fit your model object. cross_val_score() uses clones of the supplied model object to compute the scores.

To predict on data, you need to call fit() on the model explicitly.
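To see this behaviour for yourself, here is a minimal sketch. The data is made up for illustration (random features and, as an assumption to keep it small, binary labels), and the names clf, cv and fitted_before are hypothetical. It checks that the estimator passed to cross_val_score() is still unfitted afterwards, and that predict() only works once fit() has been called.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Made-up data for illustration: 3 feature columns, binary labels (assumption).
rng = np.random.RandomState(0)
X = rng.randint(0, 100, size=(100, 3))
y = rng.randint(0, 2, size=100)

clf = RandomForestClassifier(n_estimators=10, random_state=0)
cv = ShuffleSplit(n_splits=3, test_size=0.2, random_state=0)

# Cross-validation fits clones of clf; clf itself is left untouched.
scores = cross_val_score(clf, X, y, cv=cv)

# Predicting with the original object should therefore fail.
try:
    clf.predict([[28, 12, 33]])
    fitted_before = True
except NotFittedError:
    fitted_before = False

# After an explicit fit(), predict() works.
clf.fit(X, y)
prediction = clf.predict([[28, 12, 33]])

print("fitted after cross_val_score:", fitted_before)
print("prediction after fit():", prediction)
```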

You can edit the random_forest method to return the fitted model. Something like this:

unique_permutations_cross_val(X, y, clf_tree) 
clf_tree.fit(X, y) 
return clf_tree

Then, wherever you call the random_forest method, you can do this:

fitted_model = random_forest(target, data) 

# predict() expects a 2D array, one row per sample (here the [28, 12, 33] from the question).
predictions = fitted_model.predict([[28, 12, 33]])
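Putting the pieces together, the whole flow might look like the sketch below. It follows the question's setup (random toy data, target "A", ShuffleSplit with 10 splits), but folds the cross-validation helper into random_forest for brevity and fixes random_state and the NumPy seed so that runs are repeatable; both of those choices are assumptions, not part of the original code.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ShuffleSplit, cross_val_score


def random_forest(target, data):
    # Separate the features from the target column.
    X = data.drop(columns=[target]).values
    y = data[target].values

    # random_state is an assumption added here for repeatability.
    clf_tree = RandomForestClassifier(n_estimators=50, random_state=0)

    # Cross-validate on clones of clf_tree to estimate accuracy.
    cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)
    scores = cross_val_score(clf_tree, X, y, cv=cv)
    print("Accuracy: %0.4f (+/- %0.2f)" % (scores.mean(), scores.std()))

    # Fit the actual object on all the data so it can be used to predict.
    clf_tree.fit(X, y)
    return clf_tree


np.random.seed(0)  # assumption: fixed seed so the toy data is repeatable
data = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list("ABCD"))

fitted_model = random_forest("A", data)

# One sample with values for B, C and D; the result is the predicted value of "A".
predictions = fitted_model.predict([[28, 12, 33]])
print(predictions)
```

With random toy data the reported accuracy will be close to zero, as the question already anticipates; the point is only that predict() returns one label per input row once the model has been fitted.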


Source

2017-11-21 01:56:24
