検出のためのより効率的な読み込み方法

テンソルフローオブジェクト検出apiを使用して、セミリアルタイムのオブジェクト検出タスクを実行しています。画像は2枚/秒の速度でカメラで撮影されます。エッチイメージは4つの小さなイメージに切り取られますので、合計で8イメージ/秒を処理する必要があります。検出のためのより効率的な読み込み方法

私の検出モデルはフリーズグラフ（.pbファイル）にエクスポートされ、GPUメモリにロードされています。次に、私はそれらを私のモデルに供給するためにnumpy配列に画像を読み込みます。

検出自体には約0.1秒かかりますが、各画像の読み込みには約0.45秒かかります。

私が使用しているスクリプトは、オブジェクト検出api（link）によって提供されたコードサンプルから改訂され、各画像を読み込み、numpy配列に変換して検出モデルに入力します。このプロセスで最も多く消費される時間はload_image_into_numpy_arrayです。ほとんどの時間は0.45秒です。

スクリプトは以下である：私はカメラによって生成された画像をロードするためのより効率的な方法を考えています

import numpy as np 
import os 
import six.moves.urllib as urllib 
import sys 
import tarfile 
import tensorflow as tf 
import zipfile 
import timeit 
import scipy.misc 


from collections import defaultdict 
from io import StringIO 
from matplotlib import pyplot as plt 
from PIL import Image 


from utils import label_map_util 

from utils import visualization_utils as vis_util 

# Path to frozen detection graph. This is the actual model that is used for the 
# object detection. 
PATH_TO_CKPT = 'animal_detection.pb' 

# List of the strings that is used to add correct label for each box. 
PATH_TO_LABELS = os.path.join('data', 'animal_label_map.pbtxt') 

NUM_CLASSES = 1 


detection_graph = tf.Graph() 
with detection_graph.as_default(): 
    od_graph_def = tf.GraphDef() 
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid: 
     serialized_graph = fid.read() 
     od_graph_def.ParseFromString(serialized_graph) 
     tf.import_graph_def(od_graph_def,name='') 

label_map = label_map_util.load_labelmap(PATH_TO_LABELS) 
categories = label_map_util.convert_label_map_to_categories(label_map, 
                  max_num_classes=NUM_CLASSES, 
                  use_display_name=True) 
category_index = label_map_util.create_category_index(categories) 

def load_image_into_numpy_array(image): 
    (im_width, im_height) = image.size 
    return np.array(image.getdata()).reshape(
     (im_height, im_width, 3)).astype(np.uint8) 

# For the sake of simplicity we will use only 2 images: 
    # image1.jpg 
    # image2.jpg 
    # If you want to test the code with your images, just add path to the 
    # images to the TEST_IMAGE_PATHS. 
PATH_TO_TEST_IMAGES_DIR = 'test' 
TEST_IMAGE_PATHS = [ 
    os.path.join(PATH_TO_TEST_IMAGES_DIR,'image{}.png'.format(i)) for i in range(1, 10) ] 

    # Size, in inches, of the output images. 
IMAGE_SIZE = (12, 8) 
config = tf.ConfigProto() 
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1 
with detection_graph.as_default(): 
    with tf.Session(graph=detection_graph, config=config) as sess: 
    for image_path in TEST_IMAGE_PATHS: 
     start = timeit.default_timer() 
     image = Image.open(image_path) 
     # the array based representation of the image will be used later in order to prepare the 
     # result image with boxes and labels on it. 
     image_np = load_image_into_numpy_array(image) 
     # Expand dimensions since the model expects images to have shape: [1, None, None, 3] 
     image_np_expanded = np.expand_dims(image_np, axis=0) 
     image_tensor = detection_graph.get_tensor_by_name('image_tensor:0') 
     end = timeit.default_timer() 
     print(end-start) 
     start = timeit.default_timer() 
     # Each box represents a part of the image where a particular object was detected. 
     boxes = detection_graph.get_tensor_by_name('detection_boxes:0') 
     # Each score represent how level of confidence for each of the objects. 
     # Score is shown on the result image, together with the class label. 
     scores = detection_graph.get_tensor_by_name('detection_scores:0') 
     classes = detection_graph.get_tensor_by_name('detection_classes:0') 
     num_detections = detection_graph.get_tensor_by_name('num_detections:0') 
     # Actual detection. 
     (boxes, scores, classes, num_detections) = sess.run(
      [boxes, scores, classes, num_detections], 
      feed_dict={image_tensor: image_np_expanded}) 
     stop = timeit.default_timer() 
     print (stop - start) 
     # Visualization of the results of a detection. 
    vis_util.visualize_boxes_and_labels_on_image_array(
     image_np, 
     np.squeeze(boxes), 
     np.squeeze(classes).astype(np.int32), 
     np.squeeze(scores), 
     category_index, 
     use_normalized_coordinates=True, 
     line_thickness=2)

、最初に考えたのはnumpyの配列を回避しにネイティブな方法をtensorflow使用しようとすることです画像を読み込むことができますが、私はテンソルフローが非常に新しいので、どこから始めるべきか分かりません。

イメージを読み込むためのテンソルフローの方法が見つかった場合は、4つのイメージを1つのバッチに入れてモデルにフィードして、速度を改善できるかもしれません。

未熟なアイデアはtf_recordファイルに1枚の生の画像からトリミングし、モデルを養うために1つのバッチとしてtf_recordファイルをロードするが、私はそれを達成するためにどのようには考えている4枚の小さな画像を保存しようとしています。

ご協力いただければ幸いです。

出典

2017-08-17 Xinyao Wang

イメージローディングを0.4秒から0.01秒に短縮できる1つの解決策が見つかりました。誰かが同じ問題を抱えている場合に備えて、私はここに回答を掲示します。 PIL.Imageとnumpyを使用する代わりに、opencvでimreadを使用できます。さらに高速化を実現するため、イメージをバッチ処理しました。

import numpy as np 
import os 
import six.moves.urllib as urllib 
import sys 
import tensorflow as tf 
import timeit 
import cv2 


from collections import defaultdict 

from utils import label_map_util 

from utils import visualization_utils as vis_util 

MODEL_PATH = sys.argv[1] 
IMAGE_PATH = sys.argv[2] 
BATCH_SIZE = int(sys.argv[3]) 
# Path to frozen detection graph. This is the actual model that is used for the 
# object detection. 
PATH_TO_CKPT = os.path.join(MODEL_PATH, 'frozen_inference_graph.pb') 

# List of the strings that is used to add correct label for each box. 
PATH_TO_LABELS = os.path.join('data', 'animal_label_map.pbtxt') 

NUM_CLASSES = 1 

detection_graph = tf.Graph() 
with detection_graph.as_default(): 
    od_graph_def = tf.GraphDef() 
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid: 
     serialized_graph = fid.read() 
     od_graph_def.ParseFromString(serialized_graph) 
     tf.import_graph_def(od_graph_def,name='') 

label_map = label_map_util.load_labelmap(PATH_TO_LABELS) 
categories = label_map_util.convert_label_map_to_categories(label_map, 
                  max_num_classes=NUM_CLASSES, 
                  use_display_name=True) 
category_index = label_map_util.create_category_index(categories) 

PATH_TO_TEST_IMAGES_DIR = IMAGE_PATH 
TEST_IMAGE_PATHS = [ 
    os.path.join(PATH_TO_TEST_IMAGES_DIR,'image{}.png'.format(i)) for i in range(1, 129) ] 

config = tf.ConfigProto() 
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1 
with detection_graph.as_default(): 
    with tf.Session(graph=detection_graph, config=config) as sess: 
    for i in range(0, len(TEST_IMAGE_PATHS), BATCH_SIZE): 
     images = [] 
     start = timeit.default_timer() 
     for j in range(0, BATCH_SIZE): 
      image = cv2.imread(TEST_IMAGE_PATHS[i+j]) 
      image = np.expand_dims(image, axis=0) 
      images.append(image) 
      image_np_expanded = np.concatenate(images, axis=0) 
     image_tensor = detection_graph.get_tensor_by_name('image_tensor:0') 
     # Each box represents a part of the image where a particular object was detected. 
     boxes = detection_graph.get_tensor_by_name('detection_boxes:0') 
     # Each score represent how level of confidence for each of the objects. 
     # Score is shown on the result image, together with the class label. 
     scores = detection_graph.get_tensor_by_name('detection_scores:0') 
     classes = detection_graph.get_tensor_by_name('detection_classes:0') 
     num_detections = detection_graph.get_tensor_by_name('num_detections:0') 
     # Actual detection. 
     (boxes, scores, classes, num_detections) = sess.run(
      [boxes, scores, classes, num_detections], 
      feed_dict={image_tensor: image_np_expanded}) 
     stop = timeit.default_timer() 
     print (stop - start)

：

スクリプトは次のようになります

出典

2017-08-18 00:50:20

検出のためのより効率的な読み込み方法

答えて

関連する問題