3
数千歩のim2txt列車は、次のエラーで停止します。 トレーニングファイルを確認しても問題はありません。トレーニングテンソルフローim2txtが、切り捨てられたレコードで失敗します。
Ubuntu 16.04、TF r.0.11、GPUモードGTX 970 4Gbで動作します。
RAMが不足しているかどうかはわかりませんか?
INFO:tensorflow:global step 56396: loss = 2.4654 (0.41 sec/step)
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors.DataLossError'>, truncated record at 369740238
[[Node: ReaderRead = ReaderRead[_class=["loc:@TFRecordReader", "loc:@filename_queue"], _device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReader, filename_queue)]]
Caused by op u'ReaderRead', defined at:
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 114, in <module>
tf.app.run()
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 65, in main
model.build()
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/show_and_tell_model.py", line 352, in build
self.build_inputs()
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/show_and_tell_model.py", line 153, in build_inputs
num_reader_threads=self.config.num_input_reader_threads)
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/ops/inputs.py", line 115, in prefetch_input_data
_, value = reader.read(filename_queue)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/io_ops.py", line 277, in read
return gen_io_ops._reader_read(self._reader_ref, queue_ref, name=name)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 211, in _reader_read
queue_handle=queue_handle, name=name)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 748, in apply_op
op_def=op_def)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2403, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1305, in __init__
self._traceback = _extract_stack()
DataLossError (see above for traceback): truncated record at 369740238
[[Node: ReaderRead = ReaderRead[_class=["loc:@TFRecordReader", "loc:@filename_queue"], _device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReader, filename_queue)]]
INFO:tensorflow:global step 56397: loss = 2.5540 (0.40 sec/step)