How to set the number of epochs with tf.python_io.tf_record_iterator

I am trying to iterate over my dataset multiple times, so I want to know how to set the number of epochs with tf.python_io.tf_record_iterator. I used it as follows:

record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename) 
for z in range(4): 
    for k, string_record in enumerate(record_iterator): 
        ....

However, the outer loop has no effect: iteration ends as soon as the inner loop has gone through the dataset once.

Sorry for the trouble.
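A likely explanation for the behavior described above, offered as a sketch rather than a confirmed diagnosis: tf.python_io.tf_record_iterator returns a one-shot Python generator, so once the inner loop has consumed it, later passes of the outer loop find nothing left to read. Re-creating the iterator at the start of each epoch avoids this. In the snippet below, make_record_iterator is a hypothetical stand-in (a plain generator over an in-memory list) so that the example runs without TensorFlow or a .tfrecords file. The two count_reads_* helpers are also illustrative names, not part of any library.

```python
def make_record_iterator(records):
    """Hypothetical stand-in for tf.python_io.tf_record_iterator(path=...).

    Assumption: like the real function, it yields each record once and is
    then exhausted.
    """
    for record in records:
        yield record


def count_reads_shared_iterator(records, num_epochs):
    """Mirrors the question's code: one iterator is shared by all epochs."""
    record_iterator = make_record_iterator(records)
    reads = 0
    for _ in range(num_epochs):
        for _record in record_iterator:  # yields nothing after the first epoch
            reads += 1
    return reads


def count_reads_fresh_iterator(records, num_epochs):
    """Creates a new iterator at the start of every epoch."""
    reads = 0
    for _ in range(num_epochs):
        record_iterator = make_record_iterator(records)
        for _record in record_iterator:
            reads += 1
    return reads


records = [b"rec0", b"rec1", b"rec2"]
shared_reads = count_reads_shared_iterator(records, num_epochs=4)
fresh_reads = count_reads_fresh_iterator(records, num_epochs=4)
print(shared_reads, fresh_reads)
```

With the real API, the equivalent change would be to move the tf.python_io.tf_record_iterator(path=tfrecords_filename) call inside the outer loop.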

Source

2017-06-02 I. A

This functionality is now provided by the new TensorFlow Dataset API. The full documentation is at https://www.tensorflow.org/api_docs/python/tf/contrib/data/Dataset.

With this new API, you can iterate over the dataset multiple times either with a for loop or by calling repeat() on the Dataset class.

import tensorflow as tf 
import numpy as np 
import time 
import cv2 

num_epoch = 2 
batch_size = 8 # This is set to 8 since 
num_threads = 9 
common = "C:/Users/user/PycharmProjects/AffectiveComputingNew/database/" 
# Nine shards: train_1_db.tfrecords ... train_9_db.tfrecords
filenames = [common + "train_%d_db.tfrecords" % i for i in range(1, 10)]

# Transforms a scalar string `example_proto` into a pair of tensors: a
# float32 feature vector of length 432 and the decoded uint8 image.
def _parse_function(example_proto): 
    features = { 
     'height': tf.FixedLenFeature([], tf.int64), 
     'width': tf.FixedLenFeature([], tf.int64), 
     'image_raw': tf.FixedLenFeature([], tf.string), 
     'features': tf.FixedLenFeature([432], tf.float32) 
    } 

    parsed_features = tf.parse_single_example(example_proto, features) 

    # This is how we create one example, that is, extract one example from the database.
    image = tf.decode_raw(parsed_features['image_raw'], tf.uint8)
    # The height and the width are used to reshape the image below.
    height = tf.cast(parsed_features['height'], tf.int32)
    width = tf.cast(parsed_features['width'], tf.int32)

    # The image is reshaped because it is flattened when stored in binary format. Therefore, we need the
    # height and the width to restore the original image.
    image = tf.reshape(image, [height, width, 3])

    features = parsed_features['features'] 

    return features, image 

random_features = tf.Variable(tf.zeros([72, 432], tf.float32)) 
random_images = tf.Variable(tf.zeros([72, 112, 112, 3], tf.uint8)) 

# One parsed dataset per file.
datasets = []
for filename in filenames:
    datasets.append(tf.contrib.data.TFRecordDataset(filename).map(_parse_function))

# Zip the nine datasets so that each element holds one example from every file.
dataset_zipped = tf.contrib.data.Dataset.zip(tuple(datasets))
dataset = dataset_zipped.batch(batch_size)

iterator = dataset.make_initializable_iterator() 
next_batch = iterator.get_next()  # A tuple of 9 (features, image) pairs, one per file

# Concatenate the per-file batches along the batch axis.
features = tf.concat([pair[0] for pair in next_batch], axis=0)
images = tf.concat([pair[1] for pair in next_batch], axis=0)

def get_features(features, images):
    # Store the current batch in the temporary variables before reordering it.
    with tf.control_dependencies([tf.assign(random_features, features), tf.assign(random_images, images)]):
        features = tf.reshape(features, shape=[9, 8, 432])  # where 8 * 9 = 72
        features = tf.transpose(features, perm=[1, 0, 2])  # shape becomes: [8, 9, 432]
        # Now frames are interleaved: 1st frame of 1st video, 1st frame of 2nd video, ...
        features = tf.reshape(features, shape=[72, 432])

        images = tf.reshape(images, shape=[9, 8, 112, 112, 3])
        images = tf.transpose(images, perm=[1, 0, 2, 3, 4])
        images = tf.reshape(images, shape=[72, 112, 112, 3])
        return features, images

condition1 = tf.equal(tf.shape(features)[0], batch_size * 9) 
condition2 = tf.equal(tf.shape(images)[0], batch_size * 9) 

condition = tf.logical_and(condition1, condition2) 

features, images = tf.cond(condition, 
          lambda: get_features(features, images), 
          lambda: get_features(random_features, random_images)) 

init_op = tf.global_variables_initializer() 

with tf.Session() as sess:
    # Initialize the variables once.
    sess.run(init_op)

    for _ in range(num_epoch):
        # Re-initialize `iterator` at the start of every epoch.
        sess.run(iterator.initializer)

        # This loop runs until the iterator signals the end of the current epoch.
        while True:
            try:
                lst = []
                features_np, images_np = sess.run([features, images])

                for f in features_np:
                    lst.append(f[0])

                print(lst)
            except tf.errors.OutOfRangeError:
                # Expected: the dataset is exhausted, so this epoch is done.
                print('errorrrrr')
                break

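The code above is the complete code showing how I use this API.

One note: the last batch retrieved may be truncated, and this leads to a problem (note that I am doing reshape operations above). So I used a temporary variable that is set equal to the batch whenever the batch size equals my (batch_size * 9). This is not important for now.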
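To make the reshape, transpose, reshape step and the truncated-batch issue easier to see, here is a small NumPy sketch with 3 videos of 2 frames each instead of 9 and 8. The helper name interleave and its fallback parameter are assumptions made for illustration. The answer's code falls back to the previously stored batch held in a TensorFlow variable, whereas this sketch simply returns whatever fallback the caller passes in.

```python
import numpy as np


def interleave(batch, num_videos, frames_per_video, fallback=None):
    """Reorder a video-major batch so that frames from different videos alternate.

    `batch` is assumed to be the per-video batches concatenated on axis 0.
    If the batch is truncated (wrong leading size), `fallback` is returned,
    because the reshape below would otherwise fail.
    """
    expected = num_videos * frames_per_video
    if batch.shape[0] != expected:
        return fallback
    rest = batch.shape[1:]
    # [videos * frames, ...] -> [videos, frames, ...]
    grouped = batch.reshape((num_videos, frames_per_video) + rest)
    # Swap the first two axes: [frames, videos, ...]
    axes = (1, 0) + tuple(range(2, grouped.ndim))
    swapped = grouped.transpose(axes)
    # Flatten back to [frames * videos, ...]
    return swapped.reshape((expected,) + rest)


# Each value encodes 10 * video_index + frame_index.
full = np.array([0, 1, 10, 11, 20, 21])
truncated = full[:5]  # simulates a short final batch

interleaved = interleave(full, num_videos=3, frames_per_video=2)
skipped = interleave(truncated, num_videos=3, frames_per_video=2)
print(interleaved.tolist(), skipped)
```

Checking the leading dimension before reshaping plays the same role as the tf.cond on tf.shape(...)[0] in the code above.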

Source

2017-09-22 22:12:01
