TensorFlow takes forever to run on small data

I am trying a simple example from a book, with a training data sample of 892 rows. It is the usual Titanic survival textbook example. I define:

def read_csv(batch_size, file_path, record_defaults): 

    filename_queue = tf.train.string_input_producer([file_path]) 
    reader = tf.TextLineReader(skip_header_lines=1) 
    key, value = reader.read(filename_queue) 

    # decode_csv will convert a Tensor from type string (the text line) in 
    # a tuple of tensor columns with the specified defaults, which also 
    # sets the data type for each column 
    decoded = tf.decode_csv(value, record_defaults=record_defaults) 


    # batch actually reads the file and loads "batch_size" rows in a single tensor 
    return tf.train.shuffle_batch(decoded, 
            batch_size=batch_size, 
            capacity=batch_size * 50, 
            min_after_dequeue=batch_size)

def inputs():
    passenger_id, survived, pclass, name, sex, age, sibsp, parch, ticket, fare, cabin, embarked = \
        read_csv(BATCH_SIZE, file_path, record_defaults)

    # convert categorical data 
    is_first_class = tf.to_float(tf.equal(pclass, [1])) 
    is_second_class = tf.to_float(tf.equal(pclass, [2])) 
    is_third_class = tf.to_float(tf.equal(pclass, [3])) 

    gender = tf.to_float(tf.equal(sex, ["female"])) 

    # Finally we pack all the features in a single matrix; 
    # We then transpose to have a matrix with one example per row and one feature per column. 
    features = tf.transpose(tf.pack([is_first_class, is_second_class, is_third_class, gender, age])) 


    print 'shape of features', features.get_shape() 
    return features, survived

And now, what I try to do:

graph = tf.Graph() 

with tf.Session(graph=graph) as sess: 

    W = tf.Variable(tf.zeros([5, 1]), name="weights") 
    b = tf.Variable(0., name="bias") 
    tf.global_variables_initializer().run() 
    print 'tf was run' 
    X,Y = inputs() 
    print 'inputs!' 
    sess.run(Y)

and I see

'tf was run!' 
'inputs!'

but the run part hangs forever (or at least for a very long time). I am running this in Jupyter with a Python 2.7 kernel and TensorFlow version 0.12.

What am I missing?

Source

2017-01-12 elelias

return tf.train.shuffle_batch(decoded, 
           batch_size=batch_size, 
           capacity=batch_size * 50, 
           min_after_dequeue=batch_size)

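With the line above (the return tf.train.shuffle_batch(...) call), you are defining the operation that extracts values from the queue and creates the batches.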

If you look at the complete signature of the method, you will notice that there is a parameter that refers to a number of threads.

tf.train.shuffle_batch( tensors, batch_size, capacity, min_after_dequeue, num_threads=1, seed=None, enqueue_many=False, shapes=None, allow_smaller_final_batch=False, shared_name=None, name=None)

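I am pointing this out because the operations you defined are executed by threads. Threads need to be started and stopped, and this function does not do that for you. The only thing this function does regarding thread handling is to add a queue runner with num_threads threads for the queue; it does not start them, so sess.run(Y) waits on a queue that nothing is filling.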
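The blocking behaviour described above can be illustrated without TensorFlow. The following is only a rough analogy written with Python 3's standard library (queue and threading), not TensorFlow's actual implementation; the names make_pipeline, produce, stop and worker are made up for this sketch. It shows a dequeue finding nothing while the producer thread has not been started, then succeeding once the thread is started, and finally a clean stop and join.

```python
import queue
import threading


def make_pipeline(capacity=8):
    """Build a bounded queue plus a producer thread that is NOT yet started."""
    q = queue.Queue(maxsize=capacity)
    stop = threading.Event()  # plays the role of a coordinator's stop request

    def produce():
        i = 0
        while not stop.is_set():
            try:
                # Use a timeout so the stop flag is rechecked when the queue is full.
                q.put(i, timeout=0.1)
                i += 1
            except queue.Full:
                continue

    return q, stop, threading.Thread(target=produce)


q, stop, worker = make_pipeline()

# No producer is running yet, so the dequeue finds nothing. A plain q.get()
# would block forever here; the timeout only makes that observable.
try:
    q.get(timeout=0.5)
    hung = False
except queue.Empty:
    hung = True

worker.start()            # analogous to starting the queue runners
first = q.get(timeout=5)  # the dequeue now returns a value

stop.set()                # analogous to asking the threads to stop
worker.join()             # analogous to waiting for the threads to finish
```

In this sketch hung ends up True, which mirrors the sess.run(Y) call in the question: the consumer side is wired up correctly, but nothing is feeding the queue until the producer thread is started.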

In practice, to start and stop the threads, you have to define, within the session, an operation that wakes up the threads in the queue:

graph = tf.Graph()

with tf.Session(graph=graph) as sess:

    W = tf.Variable(tf.zeros([5, 1]), name="weights")
    b = tf.Variable(0., name="bias")
    tf.global_variables_initializer().run()
    print 'tf was run'
    X,Y = inputs()
    # define a coordinator to start and stop the threads
    coord = tf.train.Coordinator()
    # wake up the threads
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    print 'inputs!'
    sess.run(Y)  # execute operation
    # When done, ask the threads to stop.
    coord.request_stop()
    # Wait for threads to finish.
    coord.join(threads)

Source

2017-01-12 14:58:05 nessuno

Use 'tf.train.MonitoredTrainingSession', which takes care of all the initialization and starting of queue runners. – Nandeesh
