2017-12-16 15 views
I am trying to crop a large number of images (100,000 of them) using the function below. I am doing this operation sequentially and it takes a long time. What is an efficient way to do this? Below is how I use image cropping in TensorFlow via

tf.image.crop_to_bounding_box 

My code:

def crop_images(img_dir, list_images): 
    outlist = [] 
    with tf.Session() as session: 
        for image1 in list_images[:5]: 
            image = mpimg.imread(img_dir + image1) 
            x = tf.Variable(image, name='x') 
            data_t = tf.placeholder(tf.uint8) 
            op = tf.image.encode_jpeg(data_t, format='rgb') 
            model = tf.global_variables_initializer() 
            img_name = "img/" + image1.split("_img_0")[0] + "/img_0" + image1.split("_img_0")[1] 
            height = x.shape[1] 
            [x1, y1, x2, y2] = img_bbox_dict[img_name] 
            x = tf.image.crop_to_bounding_box(x, int(y1), int(x1), int(y2) - int(y1), int(x2) - int(x1)) 
            session.run(model) 
            result = session.run(x) 
            data_np = session.run(op, feed_dict={data_t: result}) 
            # JPEG bytes must be written in binary mode. 
            with open(img_path + image1, 'wb') as fd: 
                fd.write(data_np) 

Why do you want to do this in TensorFlow? – Lescurel


You would be quicker using **GNU Parallel** and 'libvips', but I have no idea what platform you are using, what your images are like, or how they need to be cropped... –


Maybe queues and workers... –
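The "queues and workers" idea can be sketched without TensorFlow at all, since a bounding-box crop is just an array slice. This is a minimal illustration using a worker pool over in-memory NumPy arrays; the dummy images and the single shared box stand in for the question's `mpimg.imread` results and `img_bbox_dict` lookups:

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def crop(args):
    """Crop one image to its pixel bounding box (x1, y1, x2, y2)."""
    image, (x1, y1, x2, y2) = args
    return image[y1:y2, x1:x2]

# Dummy data standing in for loaded images and their bounding boxes.
images = [np.zeros((500, 500, 3), dtype=np.uint8) for _ in range(8)]
boxes = [(1, 1, 460, 460)] * len(images)

# Workers pull (image, box) pairs and crop them concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    crops = list(pool.map(crop, zip(images, boxes)))

print(crops[0].shape)  # (459, 459, 3)
```

In a real pipeline the workers would also do the file reads and writes, which is where most of the time goes and where the concurrency actually pays off.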

Answer

I will give a simplified version of one of the examples on reading data from the TensorFlow Programmer's Guide, which can be found here. Basically, it uses Readers and filename Queues to batch the image data together using a specified number of threads. These threads are coordinated using what is called a thread Coordinator.

import tensorflow as tf 
import glob 

images_path = "./" #RELATIVE glob pathname of current directory 
images_extension = "*.png" 

# Save the list of files matching pattern, so it is only computed once. 
filenames = tf.train.match_filenames_once(glob.glob(images_path+images_extension)) 
batch_size = len(glob.glob1(images_path,images_extension)) 

num_epochs=1 
standard_size = [500, 500] 
num_channels = 3 

min_after_dequeue = 10 
num_preprocess_threads = 3 
seed = 14131 

""" 
IMPORTANT: Cropping params. These are arbitrary values used only for this example. 
You will have to change them according to your requirements. 
""" 
crop_size = [200, 200] 
# 'crop_and_resize' expects each box as normalized [y1, x1, y2, x2] coordinates 
# in [0, 1], not pixel values: this is the pixel box [1, 1, 460, 460] divided 
# by the resized image size of 500. 
boxes = [1.0/500, 1.0/500, 460.0/500, 460.0/500] 


""" 
'WholeFileReader' is a Reader who's 'read' method outputs the next 
key-value pair of the filename and the contents of the file (the image) from 
the Queue, both of which are string scalar Tensors. 

Note that the The QueueRunner works in a thread separate from the 
Reader that pulls filenames from the queue, so the shuffling and enqueuing 
process does not block the reader. 

'resize_images' is used so that all images are resized to the same 
size (Aspect ratios may change, so in that case use resize_image_with_crop_or_pad) 

'set_shape' is used because the height and width dimensions of 'image' are 
data dependent and cannot be computed without executing this operation. Without 
this Op, the 'image' Tensor's shape will have None as Dimensions. 
""" 
def read_my_file_format(filename_queue, standard_size, num_channels): 
    image_reader = tf.WholeFileReader() 
    _, image_file = image_reader.read(filename_queue) 

    if "jpg" in images_extension: 
        image = tf.image.decode_jpeg(image_file) 
    elif "png" in images_extension: 
        image = tf.image.decode_png(image_file) 

    image = tf.image.resize_images(image, standard_size) 
    image.set_shape(standard_size + [num_channels]) 
    print("Successfully read file!") 
    return image 



""" 
'string_input_producer' Enters matched filenames into a 'QueueRunner' FIFO Queue. 

'shuffle_batch' creates batches by randomly shuffling tensors. The 'capacity' 
argument controls the how long the prefetching is allowed to grow the queues. 
'min_after_dequeue' defines how big a buffer we will randomly 
sample from -- bigger means better shuffling but slower startup & more memory used. 
'capacity' must be larger than 'min_after_dequeue' and the amount larger 
determines the maximum we will prefetch. 
Recommendation: min_after_dequeue + (num_threads + a small safety margin) * batch_size 
""" 
def input_pipeline(filenames, batch_size, num_epochs, standard_size, num_channels, min_after_dequeue, num_preprocess_threads, seed): 
    filename_queue = tf.train.string_input_producer(filenames, num_epochs=num_epochs, shuffle=True) 
    example = read_my_file_format(filename_queue, standard_size, num_channels) 
    capacity = min_after_dequeue + 3 * batch_size 
    example_batch = tf.train.shuffle_batch([example], batch_size=batch_size, capacity=capacity, min_after_dequeue=min_after_dequeue, num_threads=num_preprocess_threads, seed=seed, enqueue_many=False) 
    print("Batching Successful!") 
    return example_batch 


""" 
Any transformation on the image batch goes here. Refer the documentation 
for the details of how the cropping is done using this function. 
""" 
def crop_batch(image_batch, batch_size, b_boxes, crop_size): 
    cropped_images = tf.image.crop_and_resize(image_batch, boxes=[b_boxes for _ in range(batch_size)], box_ind=[i for i in range(batch_size)], crop_size=crop_size) 
    print("Cropping Successful!") 
    return cropped_images 


example_batch = input_pipeline(filenames, batch_size, num_epochs, standard_size, num_channels, min_after_dequeue, num_preprocess_threads, seed) 
cropped_images = crop_batch(example_batch, batch_size, boxes, crop_size) 



""" 
if 'num_epochs' is not `None`, the 'string_input_producer' function creates local 
counter `epochs`. Use `local_variables_initializer()` to initialize local variables. 

'Coordinator' class implements a simple mechanism to coordinate the termination 
of a set of threads. Any of the threads can call `coord.request_stop()` to ask for all 
the threads to stop. To cooperate with the requests, each thread must check for 
`coord.should_stop()` on a regular basis. 
`coord.should_stop()` returns True` as soon as `coord.request_stop()` has been called. 
A thread can report an exception to the coordinator as part of the `should_stop()` 
call. The exception will be re-raised from the `coord.join()` call. 

After a thread has called `coord.request_stop()` the other threads have a 
fixed time to stop, this is called the 'stop grace period' and defaults to 2 minutes. 
If any of the threads is still alive after the grace period expires `coord.join()` 
raises a RuntimeError reporting the laggards. 


IMPORTANT: 'start_queue_runners' starts threads for all queue runners collected in 
the graph, & returns the list of all threads. This must be executed BEFORE running 
any other training/inference/operation steps, or it will hang forever. 
""" 
with tf.Session() as sess: 
    sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()]) 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(sess=sess, coord=coord) 

    try: 
        while not coord.should_stop(): 
            # Run training steps or whatever 
            cropped_images1 = sess.run(cropped_images) 
            print(cropped_images1.shape) 

    except tf.errors.OutOfRangeError: 
        print('Load and Process done -- epoch limit reached') 
    finally: 
        # When done, ask the threads to stop. 
        coord.request_stop() 

    # Wait for the queue-runner threads to finish; the 'with' block closes the session. 
    coord.join(threads) 
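One detail worth double-checking in the code above: 'crop_and_resize' takes boxes in normalized [y1, x1, y2, x2] coordinates, while the question's bounding boxes are pixel [x1, y1, x2, y2] values. A small (hypothetical, illustrative) converter makes the relationship explicit:

```python
def to_normalized_box(x1, y1, x2, y2, height, width):
    """Convert a pixel box (x1, y1, x2, y2) to the normalized
    [y1, x1, y2, x2] form expected by tf.image.crop_and_resize."""
    return [y1 / height, x1 / width, y2 / height, x2 / width]

# The pixel box [1, 1, 460, 460] on a 500x500 image:
print(to_normalized_box(1, 1, 460, 460, 500.0, 500.0))
# [0.002, 0.002, 0.92, 0.92]
```

If you use it, note that 'height' and 'width' must be the dimensions *after* 'resize_images', since that is the image the boxes are applied to.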