Does the TensorFlow Object Detection API require a specific image size?

I am using object_detection from the TensorFlow models repository.

I want to train on my own dataset, which consists of very specific images. The images do not share a fixed size; their dimensions vary a lot.

I get the following error:

InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,1446,1024,3] vs. shape[1] = [1,1449,1024,3] 
    [[Node: concat_1 = ConcatV2[N=8, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/gpu:0"](Preprocessor/sub, Preprocessor_1/sub, Preprocessor_2/sub, Preprocessor_3/sub, Preprocessor_4/sub, Preprocessor_5/sub, Preprocessor_6/sub, Preprocessor_7/sub, concat_1/axis)]] 
    [[Node: MultiClassNonMaxSuppression_1/Equal/_3597 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_17245_MultiClassNonMaxSuppression_1/Equal", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

The complete output can be found on pastebin.

Below is the configuration I used.

# Faster R-CNN with Resnet-50 (v1), configured for Oxford-IIT Pets Dataset. 
# Users should configure the fine_tune_checkpoint field in the train config as 
# well as the label_map_path and input_path fields in the train_input_reader and 
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that 
# should be configured. 

model {
  faster_rcnn {
    num_classes: 16
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet50'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 8
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 0
            learning_rate: .0003
          }
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "train.record"
  }
  label_map_path: "label_map.pbtxt"
}

eval_config: {
  num_examples: 200
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "val.record"
  }
  label_map_path: "label_map.pbtxt"
}

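Question 1: Does the detection API require input images of specific dimensions?

Question 2: Why do I get this error? How can I fix it, or where should I start looking?

What I have already tried is giving all images a width of 1024px and of 500px.

The steps I took:

I wrote a create_record.py file and used it to create the train.record and val.record files.
I ran train.py, which failed with the error above.

I am using Python 3.5.2 on Ubuntu 16.04 with one Nvidia GPU.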

Source

2017-07-12 ArjanSchouten

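I solved this problem by changing batch_size to 1. The size of the tensor can differ from image to image, and the images in a batch are concatenated into a single tensor, which only works when their shapes match. If all your images have the same size, you can set batch_size higher. That is not the case here, so batch_size has to be 1. So the answer is that the API can handle different image dimensions as long as batch_size is 1.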
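As a rough illustration of why mixed image sizes break a batch larger than 1, here is a minimal sketch in plain Python. It is not the library's implementation: resized_shape is a hypothetical helper that only approximates an aspect-ratio-preserving resize (shorter side scaled to min_dimension, unless that would push the longer side past max_dimension), and the image sizes are made up rather than taken from the question.

```python
def resized_shape(height, width, min_dimension=600, max_dimension=1024):
    """Approximate (height, width) after an aspect-ratio-preserving resize.

    Simplified model, assumed for illustration only.
    """
    # First try to scale the shorter side to min_dimension.
    scale = min_dimension / min(height, width)
    # If the longer side would then exceed max_dimension, scale by it instead.
    if max(height, width) * scale > max_dimension:
        scale = max_dimension / max(height, width)
    return (round(height * scale), round(width * scale))


def can_batch(shapes):
    """Images can be concatenated into one batch only if all shapes are equal."""
    return len(set(shapes)) <= 1


# Hypothetical original image sizes as (height, width).
originals = [(1000, 800), (1200, 800), (1000, 800)]
shapes = [resized_shape(h, w) for h, w in originals]

print(shapes)                 # resized shapes still differ between images
print(can_batch(shapes))      # a batch of mixed shapes cannot be concatenated
print(can_batch(shapes[:1]))  # a batch of one image always can
```

Because the aspect ratio is preserved, images with different proportions still end up with different shapes after resizing, which is consistent with the two mismatched shapes reported in the ConcatOp error. With a single image per batch there is nothing to mismatch.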

Source

2017-07-12 14:11:33 ArjanSchouten
