Hi @datitran,
I followed your steps and want to train on my local machine. My main config is below:
  fine_tune_checkpoint: "F:\GitHub\ssd_mobilenet_v1_coco_11_06_2017\model.ckpt"
  from_detection_checkpoint: true
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "F:\GitHub\raccoon_dataset-master\data\train.record"
  }
  label_map_path: "F:\GitHub\raccoon_dataset-master\training\object-detection.pbtxt"
}
eval_config: {
  num_examples: 40
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "F:\GitHub\raccoon_dataset-master\data\test.record"
  }
  label_map_path: "F:\GitHub\raccoon_dataset-master\training\object-detection.pbtxt"
  shuffle: false
  num_readers: 1
}
My environment is Windows 7, a GTX 1060, and 8 GB of memory.
The issue is shown below: training always stops with the error "INFO:tensorflow:Caught OutOfRangeError. Stopping Training."
Do you know why this happens?
Thank you very much.
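By the way, to rule out an empty or corrupt train.record I also tried a quick sanity check. The sketch below just walks the TFRecord on-disk framing (8-byte little-endian length, 4-byte length CRC, payload, 4-byte payload CRC) and counts records; it does not verify the CRCs, so it is only a rough check, and `count_tfrecords` is my own helper, not part of TensorFlow:

```python
import struct

def count_tfrecords(path):
    """Rough record count for a TFRecord file.

    Walks the framing [8-byte LE length][4-byte length CRC]
    [payload][4-byte payload CRC] without verifying the CRCs.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:        # end of file
                break
            (length,) = struct.unpack("<Q", header)
            f.seek(4 + length + 4, 1)  # skip length CRC, payload, payload CRC
            count += 1
    return count
```

If this returns 0 (or raises), the input reader would close its queue immediately, which would match the OutOfRangeError below.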
F:\models-master>python object_detection/train.py --logtostderr --pipeline_config_path=F:\GitHub\raccoon_dataset-master\training\ssd_mobilenet_v1_pets.config --train_dir=F:\train_dir
INFO:tensorflow:Summary name Learning Rate is illegal; using Learning_Rate instead.
WARNING:tensorflow:From F:\GitHub\models-master\object_detection\meta_architectures\ssd_meta_arch.py:607: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-09-02 09:50:37.901800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-09-02 09:50:37.901800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-02 09:50:37.901800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-02 09:50:37.901800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-02 09:50:37.902800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-02 09:50:37.902800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-02 09:50:37.902800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-02 09:50:37.903800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-09-02 09:50:38.065800: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX 1060 6GB
major: 6 minor: 1 memoryClockRate (GHz) 1.7085
pciBusID 0000:01:00.0
Total memory: 6.00GiB
Free memory: 5.55GiB
2017-09-02 09:50:38.067800: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:961] DMA: 0
2017-09-02 09:50:38.068800: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0: Y
2017-09-02 09:50:38.068800: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0)
2017-09-02 09:50:44.297800: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\simple_placer.cc:675] Ignoring device specification /device:GPU:0 for node 'prefetch_queue_Dequeue' because the input edge from 'prefetch_queue' is a reference connection and already has a device field set to /device:CPU:0
INFO:tensorflow:Restoring parameters from F:\GitHub\ssd_mobilenet_v1_coco_11_06_2017\model.ckpt
INFO:tensorflow:Starting Session.
[[Node: prefetch_queue_Dequeue = QueueDequeueV2[component_types=[DT_INT32, DT_STRING, DT_INT32, DT_FLOAT, DT_BOOL, DT_FLOAT, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_FLOAT, DT_STRING, DT_INT64, DT_INT64, DT_STRING, DT_INT64, DT_BOOL, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](prefetch_queue)]]
2017-09-02 09:48:05.589800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\framework\op_kernel.cc:1158] Out of range: FIFOQueue '_6_prefetch_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: prefetch_queue_Dequeue = QueueDequeueV2[component_types=[DT_INT32, DT_STRING, DT_INT32, DT_FLOAT, DT_BOOL, DT_FLOAT, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_FLOAT, DT_STRING, DT_INT64, DT_INT64, DT_STRING, DT_INT64, DT_BOOL, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](prefetch_queue)]]
2017-09-02 09:48:05.588800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\framework\op_kernel.cc:1158] Out of range: FIFOQueue '_6_prefetch_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: prefetch_queue_Dequeue = QueueDequeueV2[component_types=[DT_INT32, DT_STRING, DT_INT32, DT_FLOAT, DT_BOOL, DT_FLOAT, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_FLOAT, DT_STRING, DT_INT64, DT_INT64, DT_STRING, DT_INT64, DT_BOOL, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](prefetch_queue)]]
2017-09-02 09:48:05.590800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\framework\op_kernel.cc:1158] Out of range: FIFOQueue '_6_prefetch_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: prefetch_queue_Dequeue = QueueDequeueV2[component_types=[DT_INT32, DT_STRING, DT_INT32, DT_FLOAT, DT_BOOL, DT_FLOAT, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_FLOAT, DT_STRING, DT_INT64, DT_INT64, DT_STRING, DT_INT64, DT_BOOL, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](prefetch_queue)]]
2017-09-02 09:48:05.587800: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\framework\op_kernel.cc:1158] Out of range: FIFOQueue '_6_prefetch_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: prefetch_queue_Dequeue = QueueDequeueV2[component_types=[DT_INT32, DT_STRING, DT_INT32, DT_FLOAT, DT_BOOL, DT_FLOAT, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_FLOAT, DT_STRING, DT_INT64, DT_INT64, DT_STRING, DT_INT64, DT_BOOL, DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](prefetch_queue)]]
INFO:tensorflow:Caught OutOfRangeError. Stopping Training.
INFO:tensorflow:Finished training! Saving model to disk.
Traceback (most recent call last):
  File "object_detection/train.py", line 198, in <module>
    tf.app.run()
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "F:\GitHub\models-master\object_detection\trainer.py", line 296, in train
    saver=saver)
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\contrib\slim\python\slim\learning.py", line 759, in train
    sv.saver.save(sess, sv.save_path, global_step=sv.global_step)
  File "C:\Program Files\Anaconda3\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\training\supervisor.py", line 964, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\training\supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\training\coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "C:\Program Files\Anaconda3\lib\site-packages\six.py", line 686, in reraise
    raise value
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\training\queue_runner_impl.py", line 238, in _run
    enqueue_callable()
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1063, in _single_operation_run
    target_list_as_strings, status, None)
  File "C:\Program Files\Anaconda3\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
accoon_dataset-master\data rain.record : The filename, directory name, or volume label syntax is incorrect. (This was a mojibake GBK-encoded Chinese Windows error message in the console; translated to English here.)
[[Node: parallel_read/ReaderReadV2_1 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](parallel_read/TFRecordReaderV2_1, parallel_read/filenames)]]
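One more clue: the mangled path in that last error ("accoon_dataset-master\data rain.record") is exactly what you would get if the "\r" and "\t" in "F:\GitHub\raccoon_dataset-master\data\train.record" were interpreted as a carriage return and a tab. The snippet below illustrates the same escaping behaviour using Python string literals (an analogy only; I have not confirmed where in the pipeline the escaping actually happens):

```python
# In non-raw Python string literals, "\t" is a tab and "\r" a carriage return,
# so a single-backslash Windows path silently gains control characters.
escaped = "data\train.record"    # backslash-t is parsed as one tab character
raw = r"data\train.record"       # raw string: the backslash is kept literally

assert "\t" in escaped
assert "\t" not in raw
assert len(escaped) == len(raw) - 1  # the two characters \ and t became one tab
```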