TensorFlow learning rate decay - how do I correctly supply the step number for the decay?

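I am training a deep network in TensorFlow and am trying to use learning rate decay with it. As far as I can tell, I should use the train.exponential_decay function, which computes the appropriate learning rate for the current training step from several parameters. I only need to give it the step that is currently being run. I assumed I should use tf.placeholder(tf.int32), as I usually do when I need to feed something into the network, but that seems to be wrong. When I do this, I get the following error: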

TypeError: Input 'ref' of 'AssignAdd' Op requires l-value input

What am I doing wrong? Unfortunately, I could not find a good example of training a network with decay. My full code is below. The network has two hidden ReLU layers, an L2 penalty on the weights, and dropout on both hidden layers.

#We try the following - 2 ReLU layers 
#Dropout on both of them 
#Also L2 regularization on them 
#and learning rate decay also 


#batch size for SGD 
batch_size = 128 
#beta parameter for L2 loss 
beta = 0.001 

#that's how many hidden neurons we want 
num_hidden_neurons = 1024 

#learning rate decay 
#starting value, number of steps decay is performed, 
#size of the decay 
start_learning_rate = 0.05 
decay_steps = 1000 
decay_size = 0.95 

#building tensorflow graph 
graph = tf.Graph() 
with graph.as_default(): 
    # Input data. For the training data, we use a placeholder that will be fed 
    # at run time with a training minibatch. 
    tf_train_dataset = tf.placeholder(tf.float32, 
            shape=(batch_size, image_size * image_size)) 
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels)) 
    tf_valid_dataset = tf.constant(valid_dataset) 
    tf_test_dataset = tf.constant(test_dataset) 

    #now let's build our first hidden layer 
    #its weights 
    hidden_weights_1 = tf.Variable(
    tf.truncated_normal([image_size * image_size, num_hidden_neurons])) 
    hidden_biases_1 = tf.Variable(tf.zeros([num_hidden_neurons])) 

    #now the layer 1 itself. It multiplies data by weights, adds biases 
    #and takes ReLU over result 
    hidden_layer_1 = tf.nn.relu(tf.matmul(tf_train_dataset, hidden_weights_1) + hidden_biases_1) 

    #add dropout on hidden layer 1 
    #keep_prob is the probability of keeping each activation; 
    #dropout switches off the rest 
    keep_prob = tf.placeholder("float") 
    hidden_layer_drop_1 = tf.nn.dropout(hidden_layer_1, keep_prob) 

    #now let's build our second hidden layer 
    #its weights 
    hidden_weights_2 = tf.Variable(
    tf.truncated_normal([num_hidden_neurons, num_hidden_neurons])) 
    hidden_biases_2 = tf.Variable(tf.zeros([num_hidden_neurons])) 

    #now the layer 2 itself. It multiplies data by weights, adds biases 
    #and takes ReLU over result 
    hidden_layer_2 = tf.nn.relu(tf.matmul(hidden_layer_drop_1, hidden_weights_2) + hidden_biases_2) 

    #add dropout on hidden layer 2 
    #reuse the same keep_prob to switch off 
    #activations of the second hidden layer 
    hidden_layer_drop_2 = tf.nn.dropout(hidden_layer_2, keep_prob) 

    #time to go for output linear layer 
    #out weights connect hidden neurons to output labels 
    #biases are added to output labels 
    out_weights = tf.Variable(
    tf.truncated_normal([num_hidden_neurons, num_labels])) 

    out_biases = tf.Variable(tf.zeros([num_labels])) 

    #compute output 
    #notice that upon training we use the switched off activations 
    #i.e. the variation of hidden_layer with the dropout active 
    out_layer = tf.matmul(hidden_layer_drop_2,out_weights) + out_biases 
    #our real output is a softmax of prior result 
    #and we also compute its cross-entropy to get our loss 
    #Notice - we introduce our L2 here 
    loss = (tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    out_layer, tf_train_labels) + 
    beta*tf.nn.l2_loss(hidden_weights_1) + 
    beta*tf.nn.l2_loss(hidden_biases_1) + 
    beta*tf.nn.l2_loss(hidden_weights_2) + 
    beta*tf.nn.l2_loss(hidden_biases_2) + 
    beta*tf.nn.l2_loss(out_weights) + 
    beta*tf.nn.l2_loss(out_biases))) 

    #variable to count number of steps taken 
    global_step = tf.placeholder(tf.int32) 

    #compute current learning rate 
    learning_rate = tf.train.exponential_decay(start_learning_rate, global_step, decay_steps, decay_size) 
    #use it in optimizer 
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step) 

    #nice, now let's calculate the predictions on each dataset for evaluating the 
    #performance so far 
    # Predictions for the training, validation, and test data. 
    train_prediction = tf.nn.softmax(out_layer) 
    valid_relu_1 = tf.nn.relu( tf.matmul(tf_valid_dataset, hidden_weights_1) + hidden_biases_1) 
    valid_relu_2 = tf.nn.relu( tf.matmul(valid_relu_1, hidden_weights_2) + hidden_biases_2) 
    valid_prediction = tf.nn.softmax(tf.matmul(valid_relu_2, out_weights) + out_biases) 

    test_relu_1 = tf.nn.relu(tf.matmul(tf_test_dataset, hidden_weights_1) + hidden_biases_1) 
    test_relu_2 = tf.nn.relu(tf.matmul(test_relu_1, hidden_weights_2) + hidden_biases_2) 
    test_prediction = tf.nn.softmax(tf.matmul(test_relu_2, out_weights) + out_biases) 



#now is the actual training on the ANN we built 
#we will run it for some number of steps and evaluate the progress after 
#every 500 steps 

#number of steps we will train our ANN 
num_steps = 3001 

#actual training 
with tf.Session(graph=graph) as session: 
    tf.initialize_all_variables().run() 
    print("Initialized") 
    for step in range(num_steps): 
        # Pick an offset within the training data, which has been randomized.
        # Note: we could use better randomization across epochs.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        # Generate a minibatch.
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        # Prepare a dictionary telling the session where to feed the minibatch.
        # The key of the dictionary is the placeholder node of the graph to be fed,
        # and the value is the numpy array to feed to it.
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels, keep_prob : 0.5, global_step: step}
        _, l, predictions = session.run(
            [optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (step % 500 == 0):
            print("Minibatch loss at step %d: %f" % (step, l))
            print("Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels))
            print("Validation accuracy: %.1f%%" % accuracy(
                valid_prediction.eval(), valid_labels))
            print("Test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))

Source

2016-07-10 Maxim Haytovich

Instead of using a placeholder for global_step, try using a Variable.

# step counter; trainable=False keeps it out of the set of trainable variables
global_step = tf.Variable(0, trainable=False)

You will also need to remove global_step from feed_dict. Note that you do not have to increment global_step manually; TensorFlow does it for you.
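To make the role of the step counter concrete, here is a minimal pure-Python sketch of the schedule that tf.train.exponential_decay is documented to compute, namely start_learning_rate * decay_size ** (global_step / decay_steps), evaluated with the hyperparameters from the question. The helper name decayed_learning_rate is made up for illustration and is not part of TensorFlow; the staircase flag mirrors the option of the same name by using integer division.

```python
def decayed_learning_rate(start_rate, step, decay_steps, decay_rate, staircase=False):
    """Illustrative re-implementation of the exponential decay formula (not TensorFlow code)."""
    # With staircase=True the exponent only changes once every decay_steps steps.
    exponent = step // decay_steps if staircase else step / decay_steps
    return start_rate * decay_rate ** exponent


# Values from the question: start_learning_rate=0.05, decay_steps=1000, decay_size=0.95
for step in (0, 500, 1000, 3000):
    print(step, decayed_learning_rate(0.05, step, 1000, 0.95))
```

Since the rate depends only on the step, the counter has to advance once per training step, whether TensorFlow increments it or you supply it yourself.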

Source

2016-07-10 23:17:39 BiBi

Errrr... it works this way, but why? Is there any information on why it works? I got the impression from the manual that I have to increment it manually, is that not the case?

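@MaximHaytovich You can either do it manually or have TF do it automatically. If you pass global_step=global_step in the call to 'minimize', you tell TF to update it automatically, which is why it fails in your case: a placeholder cannot be updated. If you use the Variable approach, you do not need to pass it in feed_dict; otherwise, you should not pass it to minimize. Keeping it in a Variable is good for resuming from checkpoints, so I would recommend doing it that way. – etarion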

@etarion Thanks, now I get it.
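The two options etarion describes can be sketched side by side. This is only an illustration under a few assumptions: it uses the tf.compat.v1 graph API so that it also runs on TensorFlow 2.x, it replaces the network with a toy one-variable loss, and the function name run_training and the short decay_steps of 2 are invented for the example.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph API (assumed to be available)


def run_training(automatic, num_steps=5):
    """Run a few SGD steps on a toy loss; return the learning rate used at each step."""
    graph = tf.Graph()
    with graph.as_default():
        w = tf1.Variable(1.0)
        loss = tf.square(w)  # toy loss standing in for the real network
        if automatic:
            # Variable counter: minimize() increments it after every update.
            global_step = tf1.Variable(0, trainable=False)
        else:
            # Placeholder counter: the training loop feeds the step itself.
            global_step = tf1.placeholder(tf.int32, shape=[])
        learning_rate = tf1.train.exponential_decay(
            0.05, global_step, 2, 0.95, staircase=True)
        sgd = tf1.train.GradientDescentOptimizer(learning_rate)
        # Hand global_step to minimize() only when it is a Variable;
        # a placeholder cannot be the target of the increment.
        train_op = sgd.minimize(
            loss, global_step=global_step if automatic else None)
        init = tf1.global_variables_initializer()

    rates = []
    with tf1.Session(graph=graph) as session:
        session.run(init)
        for step in range(num_steps):
            feed = {} if automatic else {global_step: step}
            _, lr = session.run([train_op, learning_rate], feed_dict=feed)
            rates.append(float(lr))
    return rates


print("automatic:", run_training(automatic=True))
print("manual:   ", run_training(automatic=False))
```

Both variants should follow the same schedule. The Variable version has the advantage etarion mentions: the counter is saved with the other variables, so training can resume from a checkpoint at the right learning rate.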
