deeplabcut / deeplabcut

Official implementation of DeepLabCut: Markerless pose estimation of user-defined features with deep learning for all animals incl. humans

Home Page: http://deeplabcut.org

License: GNU Lesser General Public License v3.0

behavior-analysis deep-learning pose-estimation feature-detectors toolbox deeplabcut animal-pose-estimation labeling-tool keypoint-tracking keypoint-detection

deeplabcut's Introduction

Welcome! πŸ‘‹

DeepLabCutℒ️ is a toolbox for state-of-the-art markerless pose estimation of animals performing various behaviors. As long as you can see (label) what you want to track, you can use this toolbox, as it is animal and object agnostic. Read a short development and application summary below.

Please see the docs (linked below) for all the information you need to get started! Please note that we currently support only Python 3.10+ (see the conda files for guidance).

Developers Stable Release:

  • Very quick start: you need to have TensorFlow installed (up to v2.10 is supported across platforms). Then run pip install "deeplabcut[gui,tf]", which includes all functions plus the GUIs, or pip install "deeplabcut[tf]" for the headless version (with PyTorch and TensorFlow). A quick smoke test follows below.
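
A minimal way to confirm the install worked (a sketch; the first import can take a little while):

import deeplabcut  # should import without errors after installation
print(deeplabcut.__version__)  # report the installed version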

Developers Alpha Release:

We recommend using our conda file, see here or the new deeplabcut-docker package.

Our docs walk you through using DeepLabCut and key API points. For an overview of the toolbox and the workflow for project management, see our step-by-step guide in our Nature Protocols paper.

For a deeper understanding and more resources for you to get started with Python and DeepLabCut, please check out our free online course! http://DLCcourse.deeplabcut.org

🐭 pose tracking of single animals demo Open in Colab

🐭🐭🐭 pose tracking of multiple animals demo Open in Colab

  • See more demos here. We provide data and several Jupyter Notebooks: one that walks you through a demo dataset to test your installation, and another Notebook to run DeepLabCut from the beginning on your own data. We also show you how to use the code in Docker, and on Google Colab.
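
For orientation, a condensed sketch of the standard single-animal workflow with the high-level API (the project name, experimenter, and video paths below are placeholders):

import deeplabcut

# create a project; this returns the path to its config.yaml
config_path = deeplabcut.create_new_project(
    "reaching", "alex", ["/videos/session1.avi"], copy_videos=True)

deeplabcut.extract_frames(config_path)           # select frames to label
deeplabcut.label_frames(config_path)             # opens the labeling GUI
deeplabcut.create_training_dataset(config_path)  # build the train/test split
deeplabcut.train_network(config_path)            # train the pose model
deeplabcut.evaluate_network(config_path)         # report train/test errors
deeplabcut.analyze_videos(config_path, ["/videos/session2.avi"])
deeplabcut.create_labeled_video(config_path, ["/videos/session2.avi"])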

Why use DeepLabCut?

In 2018, we demonstrated the capabilities for trail tracking, reaching in mice, and various Drosophila behaviors during egg-laying (see Mathis et al. for details). There is, however, nothing specific that makes the toolbox applicable only to these tasks and/or species. The toolbox has already been successfully applied (by us and others) to rats, humans, various fish species, bacteria, leeches, various robots, cheetahs, mouse whiskers, and racehorses.

DeepLabCut utilized the feature detectors (ResNets + readout layers) of one of the state-of-the-art algorithms for human pose estimation by Insafutdinov et al., called DeeperCut, which inspired the name for our toolbox (see references below). Since then, the package has changed substantially. The code has been re-tooled and refactored since 2.1+: we have added faster and higher-performance variants with MobileNetV2, EfficientNet, and our own DLCRNet backbones (see Pretraining boosts out-of-domain robustness for pose estimation and Lauer et al. 2022). Additionally, we have improved inference speed, provided both additional and novel augmentation methods, and added real-time and multi-animal support. In v3.0+ we changed the backend to support PyTorch; this brings not only an easier installation process for users, but also performance gains, developer flexibility, and a lot of new tools! Importantly, the high-level API stays the same, so it will be a seamless transition for users 💜!

We currently provide state-of-the-art performance for animal pose estimation, and the labs (M. Mathis Lab and A. Mathis Group) have both top-journal and computer-vision conference papers.

Left: Due to transfer learning it requires little training data for multiple, challenging behaviors (see Mathis et al. 2018 for details). Mid Left: The feature detectors are robust to video compression (see Mathis/Warren for details). Mid Right: It allows 3D pose estimation with a single network and camera (see Mathis/Warren). Right: It allows 3D pose estimation with a single network trained on data from multiple cameras together with standard triangulation methods (see Nath* and Mathis* et al. 2019).

DeepLabCut is embedded in a larger open-source ecosystem, providing behavioral tracking for neuroscience, ecology, medical, and technical applications. Moreover, many new tools are being actively developed. See DLC-Utils for some helper code.

Code contributors:

DLC code was originally developed by Alexander Mathis & Mackenzie Mathis, and was extended in 2.0 with the core dev team consisting of Tanmay Nath (2.0-2.1), and currently (2.1+) with Jessy Lauer and (2.3+) Niels Poulsen. DeepLabCut is an open-source tool and has benefited from suggestions and edits by many individuals including Mert Yuksekgonul, Tom Biasi, Richard Warren, Ronny Eichler, Hao Wu, Federico Claudi, Gary Kane and Jonny Saunders as well as the 100+ contributors. Please see AUTHORS for more details!

This is an actively developed package and we welcome community development and involvement.

Get Assistance & be part of the DLC Community✨:

πŸš‰ Platform 🎯 Goal ⏱️ Estimated Response Time πŸ“’ Support Squad
Image.sc forum
🐭Tag: DeepLabCut
To ask help and support questionsπŸ‘‹ PromptlyπŸ”₯ DLC Team and The DLC Community
GitHub DeepLabCut/Issues To report bugs and code issuesπŸ› (we encourage you to search issues first) 2-3 days DLC Team
Gitter To discuss with other users, share ideas and collaborateπŸ’‘ 2 days The DLC Community
GitHub DeepLabCut/Contributing To contribute your expertise and experienceπŸ™πŸ’― PromptlyπŸ”₯ DLC Team
🚧 GitHub DeepLabCut/Roadmap To learn more about our journey✈️ N/A N/A
Twitter Follow To keep up with our latest news and updates πŸ“’ Daily DLC Team
The DeepLabCut AI Residency Program To come and work with us next summerπŸ‘ Annually DLC Team

References:

If you use this code or data, we kindly ask that you please cite Mathis et al., 2018, and, if you use the Python package (DeepLabCut 2.x), please also cite Nath, Mathis et al., 2019. If you utilize the MobileNetV2 or EfficientNet backbones, please cite Mathis, Biasi et al., 2021. If you use versions 2.2beta+ or 2.2rc1+, please cite Lauer et al., 2022.

DOIs (#ProTip, for helping you find citations for software, check out CiteAs.org!):

Please check out the following references for more details:

@article{Mathisetal2018,
    title = {DeepLabCut: markerless pose estimation of user-defined body parts with deep learning},
    author = {Alexander Mathis and Pranav Mamidanna and Kevin M. Cury and Taiga Abe  and Venkatesh N. Murthy and Mackenzie W. Mathis and Matthias Bethge},
    journal = {Nature Neuroscience},
    year = {2018},
    url = {https://www.nature.com/articles/s41593-018-0209-y}}

@article{NathMathisetal2019,
    title = {Using DeepLabCut for 3D markerless pose estimation across species and behaviors},
    author = {Nath*, Tanmay and Mathis*, Alexander and Chen, An Chi and Patel, Amir and Bethge, Matthias and Mathis, Mackenzie W},
    journal = {Nature Protocols},
    year = {2019},
    url = {https://doi.org/10.1038/s41596-019-0176-0}}
    
@InProceedings{Mathis_2021_WACV,
    author    = {Mathis, Alexander and Biasi, Thomas and Schneider, Steffen and Yuksekgonul, Mert and Rogers, Byron and Bethge, Matthias and Mathis, Mackenzie W.},
    title     = {Pretraining Boosts Out-of-Domain Robustness for Pose Estimation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2021},
    pages     = {1859-1868}}
    
@article{Lauer2022MultianimalPE,
    title={Multi-animal pose estimation, identification and tracking with DeepLabCut},
    author={Jessy Lauer and Mu Zhou and Shaokai Ye and William Menegas and Steffen Schneider and Tanmay Nath and Mohammed Mostafizur Rahman and Valentina Di Santo and Daniel Soberanes and Guoping Feng and Venkatesh N. Murthy and George Lauder and Catherine Dulac and M. Mathis and Alexander Mathis},
    journal={Nature Methods},
    year={2022},
    volume={19},
    pages={496 - 504}}

@inproceedings{insafutdinov2016eccv,
    title = {DeeperCut: A Deeper, Stronger, and Faster Multi-Person Pose Estimation Model},
    author = {Eldar Insafutdinov and Leonid Pishchulin and Bjoern Andres and Mykhaylo Andriluka and Bernt Schiele},
    booktitle = {ECCV'16},
    url = {http://arxiv.org/abs/1605.03170}}

Review & Educational articles:

@article{Mathis2020DeepLT,
    title={Deep learning tools for the measurement of animal behavior in neuroscience},
    author={Mackenzie W. Mathis and Alexander Mathis},
    journal={Current Opinion in Neurobiology},
    year={2020},
    volume={60},
    pages={1-11}}

@article{Mathis2020Primer,
    title={A Primer on Motion Capture with Deep Learning: Principles, Pitfalls, and Perspectives},
    author={Alexander Mathis and Steffen Schneider and Jessy Lauer and Mackenzie W. Mathis},
    journal={Neuron},
    year={2020},
    volume={108},
    pages={44-65}}

Other open-access pre-prints related to our work on DeepLabCut:

@article{MathisWarren2018speed,
    author = {Mathis, Alexander and Warren, Richard A.},
    title = {On the inference speed and video-compression robustness of DeepLabCut},
    year = {2018},
    doi = {10.1101/457242},
    publisher = {Cold Spring Harbor Laboratory},
    URL = {https://www.biorxiv.org/content/early/2018/10/30/457242},
    eprint = {https://www.biorxiv.org/content/early/2018/10/30/457242.full.pdf},
    journal = {bioRxiv}}

License:

This project is primarily licensed under the GNU Lesser General Public License v3.0. Note that the software is provided "as is", without warranty of any kind, express or implied. If you use the code or data, please cite us! Note, artwork (DeepLabCut logo) and images are copyrighted; please do not take or use these images without written permission.

SuperAnimal models are provided for research use only (non-commercial use).

Major Versions:

  • For all versions, please see here.

VERSION 3.0: A whole new experience with PyTorchπŸ”₯. While the high-level API remains the same, the backend and developer friendliness have greatly improved, along with performance gains!

VERSION 2.3: Model Zoo SuperAnimals, and a whole new GUI experience.

VERSION 2.2: Multi-animal pose estimation, identification, and tracking with DeepLabCut is supported (as well as single-animal projects).

VERSION 2.0-2.1: This is the Python package of DeepLabCut that was originally released in Oct 2018 with our Nature Protocols paper (preprint here). This package includes graphical user interfaces to label your data, and take you from data set creation to automatic behavioral analysis. It also introduces an active learning framework to efficiently use DeepLabCut on large experimental projects, and data augmentation tools that improve network performance, especially in challenging cases (see panel b).

VERSION 1.0: The initial, Nature Neuroscience version of DeepLabCut can be found in the history of git, or here: https://github.com/DeepLabCut/DeepLabCut/releases/tag/1.11

News (and in the news):

πŸ’œ We released a major update, moving from 2.x --> 3.x with the backend change to PyTorch

πŸ’œ The DeepLabCut Model Zoo launches SuperAnimals, see more here.

πŸ’œ DeepLabCut supports multi-animal pose estimation! maDLC is out of beta/rc mode and beta is deprecated, thanks to the testers out there for feedback! Your labeled data will be backwards compatible, but not all other steps. Please see the new 2.2+ releases for what's new & how to install it, please see our new paper, Lauer et al 2022, and the new docs on how to use it!

πŸ’œ We support multi-animal re-identification, see Lauer et al 2022.

πŸ’œ We have a real-time package available! http://DLClive.deeplabcut.org


deeplabcut's Issues

Running model evaluation "Step1_EvaluateModelonDataset.py"

I am encountering the following error when I try to check my model training by running Step1_EvaluateModelonDataset.py:

Traceback (most recent call last):
  File "EvaluateNetwork.py", line 31, in <module>
    from nnet import predict
  File "C:\Users\neyhart\DeepLabCut\pose-tensorflow\nnet\predict.py", line 3, in <module>
    import tensorflow as tf
  File "C:\Users\neyhart\AppData\Local\Continuum\anaconda3\envs\DLCdependencies\lib\site-packages\tensorflow\__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "C:\Users\neyhart\AppData\Local\Continuum\anaconda3\envs\DLCdependencies\lib\site-packages\tensorflow\python\__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "C:\Users\neyhart\AppData\Local\Continuum\anaconda3\envs\DLCdependencies\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 30, in <module>
    self_check.preload_check()
  File "C:\Users\neyhart\AppData\Local\Continuum\anaconda3\envs\DLCdependencies\lib\site-packages\tensorflow\python\platform\self_check.py", line 70, in preload_check
    % build_info.nvcuda_dll_name)
ImportError: Could not find 'nvcuda.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Typically it is installed in 'C:\Windows\System32'. If it is not present, ensure that you have a CUDA-capable GPU with the correct driver installed.

I followed the installation instructions for "Simple installation on Windows without GPU support", so I am confused as to why I am getting an error regarding CUDA. I noticed that the DLCdependencies environment contains "tensorflow-gpu 1.8.0"; is this related to why I am getting this error?
Thank you!

Minor issues with reading in .mp4 videos

Hi there,

Here are a couple minor issues you may want to address.

There's a small error when running Step2_AnalysisofResults.py:

~/DeepLabCut-master/Evaluation-Tools$ python3 Step2_AnalysisofResults.py
Traceback (most recent call last):
  File "Step2_AnalysisofResults.py", line 92, in <module>
    for trainingiterations,index in snapindices:
TypeError: 'int' object is not iterable

I've circumvented it by putting square brackets around snapindices in line 92:

for trainingiterations,index in [snapindices]:

How can I tell whether that error is big or small?

~/DeepLabCut-master/Evaluation-Tools$ python3 Step2_AnalysisofResults.py
Results for 950000 training iterations: 95 1 train error: 0.8272971490056671 test error: 2.499064498562305

Line 115 in AnalyzeVideos.py: maybe you also want to accept other video formats, say .mp4?
videos = np.sort([fn for fn in os.listdir(os.curdir) if (".mp4" in fn)])#was avi

The same applies to line 73 in MakingLabeledVideo.py:

videos = np.sort([fn for fn in os.listdir(os.curdir) if (".mp4" in fn)])#was .avi
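
A sketch of that generalization, assuming the scripts only need the sorted filename list (the extension set is illustrative):

import os
import numpy as np

VIDEO_EXTS = (".avi", ".mp4", ".mov")  # accept several common containers
videos = np.sort([fn for fn in os.listdir(os.curdir)
                  if fn.lower().endswith(VIDEO_EXTS)])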

Maybe you want to write up a rough overview of processing time somewhere. In my case it took 24 h to train the net (cropped image, 2 points); predicting the 2 point positions ran at about 15 frames/second, and labelling the video with the predicted points at about 4 frames/second.

Anyway, those are just small comments, overall my example finger tracking worked flawlessly thanks to your great software. Many thanks for such transparent sharing!

Annotation method (skipping Step1 - using different image types or already extracted frames)

According to the readme I have tried to annotate some images, but the "format" that I seem to be getting differs a lot from what I see in the demo csv files.

As I am using one file per folder, here is an example from my Results.csv:

 ,Label,Area,Mean,Min,Max,X,Y,Slice
1,IMG_20180523_122904.jpg,0,15,15,15,1500.000,1638.000,1
2,IMG_20180523_122904.jpg,0,210,210,210,2160.000,1848.000,1
3,IMG_20180523_122904.jpg,0,15,15,15,2808.000,1740.000,1
4,IMG_20180523_122904.jpg,0,128,128,128,1182.000,1710.000,1
.......and so on

Although this might be something I should ask the Fiji guys, I would still like to know how you did the annotations. (On a side note, I think the Label column is weird; why does it contain the image name when I entered numbers there?)

Also, Steps 1, 2, 3, and 4 do not throw any error and generate the specific "separate part" .csv files, but running train.py gives me:

Config:
{'all_joints': [[0], [1], [2]],
 'all_joints_names': ['base', 'pivot', 'wrist'],
 'batch_size': 1,
 'crop': False,
 'crop_pad': 0,
 'dataset': '../../UnaugmentedDataSet_movingjuly26/moving_kinova95shuffle1.mat',
 'dataset_type': 'default',
 'display_iters': 1000,
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'init_weights': '../../pretrained/resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 'location_refinement': True,
 'locref_huber_loss': True,
 'locref_loss_weight': 0.05,
 'locref_stdev': 7.2801,
 'log_dir': 'log',
 'max_input_size': 1000,
 'mean_pixel': [123.68, 116.779, 103.939],
 'mirror': False,
 'multi_step': [[0.005, 10000],
                [0.02, 430000],
                [0.002, 730000],
                [0.001, 1030000]],
 'net_type': 'resnet_50',
 'num_joints': 3,
 'optimizer': 'sgd',
 'pos_dist_thresh': 17,
 'regularize': False,
 'save_iters': 50000,
 'scale_jitter_lo': 0.5,
 'scale_jitter_up': 1.5,
 'scoremap_dir': 'test',
 'shuffle': True,
 'snapshot_prefix': './snapshot',
 'stride': 8.0,
 'use_gt_segm': False,
 'video': False,
 'video_batch': False,
 'weigh_negatives': False,
 'weigh_only_present_joints': False,
 'weigh_part_predictions': False,
 'weight_decay': 0.0001}
2018-06-26 10:31:07.888880: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-06-26 10:31:07.983375: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-06-26 10:31:07.983891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.8095
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 5.50GiB
2018-06-26 10:31:07.983939: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "../../../train.py", line 47, in load_and_enqueue
    batch_np = dataset.next_batch()
  File "/root/deep3/DeepLabCut/pose-tensorflow/dataset/pose_dataset.py", line 158, in next_batch
    imidx, mirror = self.next_training_sample()
  File "/root/deep3/DeepLabCut/pose-tensorflow/dataset/pose_dataset.py", line 138, in next_training_sample
    self.curr_img = (self.curr_img + 1) % self.num_training_samples()
ZeroDivisionError: integer division or modulo by zero

EDIT: Also I annotated my images using Fiji and as specified.

Issue with AnalyseVideos.py (Moviepy installation problem)

Thanks in advance for any comments or suggestions.

Describe the problem
An error occurred in AnalyseVideos.py
After running the evaluation, I now have a file named "DeepCut_resnet_50_95shuffle1_50000forTask_reaching.h5" in Evaluation-Tools/Results folder.
I also have a video named "MovieS2_Perturbation_noLaser_compressed" in DeepLabCut/videos folder.
I called AnalyseVideos.py from Analysis-tools folder, and the path to videos in myconfig_analysis.py is set to videofolder = '../videos/'. Therefore, it should have the video file in it. However when I run the script I get an error saying that the file could not be found. Please help me!

Here is the exact error that appeared in terminal:

FileNotFoundError: File MovieS2_Perturbation_noLaser_compressedDeepCut_resnet50_reachingJan30shuffle1_50000.h5 does not exist

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "AnalyzeVideos.py", line 124, in <module>
    clip = VideoFileClip(video)
  File "C:\Users\annaL\AppData\Local\Programs\Python\Python35\lib\site-packages\moviepy\video\io\VideoFileClip.py", line 91, in __init__
    fps_source=fps_source)
  File "C:\Users\annaL\AppData\Local\Programs\Python\Python35\lib\site-packages\moviepy\video\io\ffmpeg_reader.py", line 33, in __init__
    fps_source)
  File "C:\Users\annaL\AppData\Local\Programs\Python\Python35\lib\site-packages\moviepy\video\io\ffmpeg_reader.py", line 272, in ffmpeg_parse_infos
    "path.")%filename)
OSError: MoviePy error: the file MovieS2_Perturbation_noLaser_compressed.avi could not be found!
Please check that you entered the correct path.

.h5 File Cannot be Found in ../Evaluation-Tools/Results

Thank you so much for the help!!

Describe the problem
While running CUDA_VISIBLE_DEVICES=0 python3 Step1_EvaluateModelonDataset.py in Evaluation-Tools, the .h5 file that should be in the Results folder seems to be missing.

Previously, we had trained using a different dataset, and the .h5 file from that training is still in the Results folder. After terminating the second training, should the .h5 file for the second dataset be generated automatically in the Results folder? Will it cause any issues if there already exists a Results folder (from the first training set) in Evaluation-Tools?

Here is the exact error that appeared in terminal:
Traceback (most recent call last):
  File "EvaluateNetwork.py", line 89, in <module>
    Data = pd.read_hdf(os.path.join("Results",DLCscorer + '.h5'),'df_with_missing')
  File "/data/home/jli819/install/anaconda3/envs/deeplabcut/lib/python3.6/site-packages/pandas/io/pytables.py", line 371, in read_hdf
    'File %s does not exist' % path_or_buf)
FileNotFoundError: File Results/DeepCut_resnet_50_95shuffle1_400000forTask_fish0817v3.h5 does not exist

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "EvaluateNetwork.py", line 111, in <module>
    outputs_np = sess.run(outputs, feed_dict={inputs: image_batch})
  File "/data/home/jli819/install/anaconda3/envs/deeplabcut/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/data/home/jli819/install/anaconda3/envs/deeplabcut/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 944, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 972, 1296, 4) for Tensor 'Placeholder:0', which has shape '(1, ?, ?, 3)'
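
The shape mismatch (4 channels instead of the expected 3) suggests the evaluation images carry an alpha channel, e.g. RGBA PNGs. A minimal sketch of a workaround, assuming the batch is a NumPy array:

import numpy as np

image_batch = np.zeros((1, 972, 1296, 4))  # stand-in for a loaded RGBA frame
if image_batch.shape[-1] == 4:
    image_batch = image_batch[..., :3]     # drop alpha to match (1, ?, ?, 3)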

To Reproduce
Steps to reproduce the behavior:
Not sure if it can be described, but the .h5 file just doesn't seem to be generated.


Index Error on Step 1 EvaluateModelonDataset.py (tuple index out of range)

Describe the problem
When running Step1_EvaluateModelonDataset.py, the following error occurs.

Starting evaluation
Running DeepCut_resnet_50_95shuffle1_400000forTask_movement with # of trainingiterations: 400000
WARNING:tensorflow:From /home/dave/DeepLabCut/pose-tensorflow/nnet/pose_net.py:52: calling resnet_arg_scope (from tensorflow.contrib.slim.python.slim.nets.resnet_utils) with is_training is deprecated and will be removed after 2017-08-01.
Instructions for updating:
Pass is_training directly to the network instead of the arg_scope.
WARNING:tensorflow:From /home/dave/DeepLabCut/pose-tensorflow/nnet/pose_net.py:52: calling resnet_arg_scope (from tensorflow.contrib.slim.python.slim.nets.resnet_utils) with is_training is deprecated and will be removed after 2017-08-01.
Instructions for updating:
Pass is_training directly to the network instead of the arg_scope.
2018-08-14 17:27:04.116584: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_UNKNOWN
2018-08-14 17:27:04.116624: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: 00e5be448947
2018-08-14 17:27:04.116636: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: 00e5be448947
2018-08-14 17:27:04.116688: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 384.130.0
2018-08-14 17:27:04.116722: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.130 Wed Mar 21 03:37:26 PDT 2018
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)
"""
2018-08-14 17:27:04.116749: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 384.130.0
2018-08-14 17:27:04.116764: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 384.130.0
Analyzing data...
0it [00:00, ?it/s]

Traceback (most recent call last):
  File "EvaluateNetwork.py", line 89, in <module>
    Data = pd.read_hdf(os.path.join("Results",DLCscorer + '.h5'),'df_with_missing')
  File "/usr/local/lib/python3.5/dist-packages/pandas/io/pytables.py", line 371, in read_hdf
    'File %s does not exist' % path_or_buf)
FileNotFoundError: File Results/DeepCut_resnet_50_95shuffle1_400000forTask_movement.h5 does not exist

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "EvaluateNetwork.py", line 115, in <module>
    pose = predict.argmax_pose_predict(scmap, locref, cfg.stride)
  File "/home/dave/DeepLabCut/pose-tensorflow/nnet/predict.py", line 43, in argmax_pose_predict
    num_joints = scmap.shape[2]
IndexError: tuple index out of range
root@00e5be448947:/home/dave/DeepLabCut/Evaluation-Tools#

To Reproduce
Simply run Step 1 for the error to occur.

Additional context
This is our first training set with the software, we successfully trained for 400000 iterations and then attempted to evaluate the program.

No module named 'moviepy' (Windows installation)

I have probably done something stupid, but I believe I followed the installation instructions, and I got a "No module named 'moviepy'" error.

I've included everything that happened in the Anaconda Prompt (including my various stupid mistakes)

dlc_fail.txt

I've since "pip install moviepy" which seems to have fixed things, but I thought you might be interested to know anyway.

demo training data

When running the demo scripts I saw that some of the labelled images have labels in the wrong place: img0-img43 are correct, but see the attached img119 for an example of a misplaced label.
Maybe this doesn't matter for running the scripts... wondering if this is normal.

Extracting Frames - Multiple videos

Hi,

When creating the training set, the first step is to extract a predetermined number of randomly selected frames from a video that will subsequently be manually labelled. To increase the effectiveness of the training, the training set should be composed of frames extracted from multiple videos. However, currently it is possible to extract frames from only one video at a time. It would be easier/faster if one could put all the training videos in a folder and have the code go through each of them automatically: one would simply specify the folder with the videos, and the code would extract the same number of frames from each video in it, without requiring further manual input (a rough sketch follows below).
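
Something like the following loop, where extract_frames_from is a hypothetical helper standing in for the current single-video extraction step:

import os

video_folder = "training_videos"  # folder holding all the training videos
frames_per_video = 20

for fn in sorted(os.listdir(video_folder)):
    if fn.lower().endswith((".avi", ".mp4")):
        # hypothetical helper wrapping the existing per-video extraction
        extract_frames_from(os.path.join(video_folder, fn), frames_per_video)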

Thank you,
Federico

Issue with Step 5 Training network on CPU (Accessing the "right" python)

Using Windows 10, I installed TensorFlow and then DeepLabCut via the "simplified installation with conda environments for a CPU". In trying to run through the demo, I got through Step 4 just fine; however, when I try to train the network I get the following error:
python: can't open file 'TF_CUDNN_USE_AUTOTUNE=0': [Errno 2] No such file or directory
I'm not sure what is going wrong (I am a beginner with python). I am not using an NVIDIA graphics card, but I don't know if my problem is related to this or not.

Multiple predictions per image

Hi,

Great software and am enjoying using it.

However, I am wondering if you could help with how multiple predictions could be achieved per image for multiple features, e.g. if doing an assay that has two animals that are similar, and you want to track both of them. In Fig.4 of your preprint it appears you achieve this reasonably well for a social behaviour assay using three mice, although it is not specified what modification was necessary to achieve multiple predictions.

Ideally, I would like an output file which shows the x,y locations at which the network makes many predictions of a particular feature, and then to threshold based on the likelihood to give the features in the environment.

Is there a way to extract values corresponding to multiple potential predictions in a single image, based on the output of the current getpose() function and the predict.py file found in nnet?
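
One rough sketch of such an extraction, assuming scmap is the H x W x num_joints score map computed before the argmax step:

import numpy as np
from scipy.ndimage import maximum_filter

def candidate_peaks(scmap, threshold=0.1, size=5):
    # return per-joint (row, col) indices of score-map local maxima
    peaks = []
    for j in range(scmap.shape[2]):  # one channel per feature/bodypart
        ch = scmap[:, :, j]
        is_peak = (maximum_filter(ch, size=size) == ch) & (ch > threshold)
        peaks.append(np.argwhere(is_peak))
    return peaks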

Thanks,
Daniel

Snapshots not getting saved (Make sure 'save_iters': 50000 is small enough so that you get intermediate results when you want them).

I'm really sorry for bothering, but I seem to be having troubles left and right.

I have successfully completed all steps before running ../../train.py and no errors were thrown, and as mentioned in #13, there is a .h5 and .csv file.

But there are no snapshots getting generated in the .models/<model-name>/train folder. It does contain a log file. Here are the contents:

2018-06-27 05:55:25 Config:
{'all_joints': [[0], [1], [2]],
 'all_joints_names': ['base', 'pivot', 'hand'],
 'batch_size': 1,
 'crop': False,
 'crop_pad': 0,
 'dataset': '../../UnaugmentedDataSet_movingJuly26/moving_kinova95shuffle1.mat',
 'dataset_type': 'default',
 'display_iters': 1000,
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'init_weights': '../../pretrained/resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 'location_refinement': True,
 'locref_huber_loss': True,
 'locref_loss_weight': 0.05,
 'locref_stdev': 7.2801,
 'log_dir': 'log',
 'max_input_size': 1000,
 'mean_pixel': [123.68, 116.779, 103.939],
 'mirror': False,
 'multi_step': [[0.005, 10000],
                [0.02, 430000],
                [0.002, 730000],
                [0.001, 1030000]],
 'net_type': 'resnet_50',
 'num_joints': 3,
 'optimizer': 'sgd',
 'pos_dist_thresh': 17,
 'regularize': False,
 'save_iters': 50000,
 'scale_jitter_lo': 0.5,
 'scale_jitter_up': 1.5,
 'scoremap_dir': 'test',
 'shuffle': True,
 'snapshot_prefix': './snapshot',
 'stride': 8.0,
 'use_gt_segm': False,
 'video': False,
 'video_batch': False,
 'weigh_negatives': False,
 'weigh_only_present_joints': False,
 'weigh_part_predictions': False,
 'weight_decay': 0.0001}
2018-06-27 05:55:27 From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/training/python/training/training.py:412: get_or_create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.get_or_create_global_step
2018-06-27 05:55:29 Restoring parameters from ../../pretrained/resnet_v1_50.ckpt
2018-06-27 05:55:30 iteration: 0 loss: 0.0009 lr: 0.005
2018-06-27 05:57:37 iteration: 1000 loss: 0.0170 lr: 0.005
2018-06-27 05:59:40 iteration: 2000 loss: 0.0058 lr: 0.005
2018-06-27 06:01:45 iteration: 3000 loss: 0.0042 lr: 0.005
2018-06-27 06:03:50 iteration: 4000 loss: 0.0033 lr: 0.005
2018-06-27 06:05:55 iteration: 5000 loss: 0.0028 lr: 0.005
2018-06-27 06:08:01 iteration: 6000 loss: 0.0025 lr: 0.005
2018-06-27 06:10:09 iteration: 7000 loss: 0.0023 lr: 0.005
2018-06-27 06:12:15 iteration: 8000 loss: 0.0021 lr: 0.005
2018-06-27 06:14:22 iteration: 9000 loss: 0.0020 lr: 0.005
2018-06-27 06:16:27 iteration: 10000 loss: 0.0018 lr: 0.005
2018-06-27 06:18:37 iteration: 11000 loss: 0.0044 lr: 0.02
2018-06-27 06:20:44 iteration: 12000 loss: 0.0028 lr: 0.02
2018-06-27 06:22:50 iteration: 13000 loss: 0.0023 lr: 0.02
2018-06-27 06:24:54 iteration: 14000 loss: 0.0020 lr: 0.02
2018-06-27 06:27:00 iteration: 15000 loss: 0.0018 lr: 0.02
2018-06-27 06:29:05 iteration: 16000 loss: 0.0017 lr: 0.02
2018-06-27 06:31:11 iteration: 17000 loss: 0.0016 lr: 0.02
2018-06-27 06:33:16 iteration: 18000 loss: 0.0015 lr: 0.02
2018-06-27 06:35:24 iteration: 19000 loss: 0.0014 lr: 0.02
2018-06-27 06:37:33 iteration: 20000 loss: 0.0013 lr: 0.02
2018-06-27 06:39:37 iteration: 21000 loss: 0.0013 lr: 0.02
2018-06-27 06:41:44 iteration: 22000 loss: 0.0012 lr: 0.02
2018-06-27 06:43:52 iteration: 23000 loss: 0.0012 lr: 0.02

...followed by Ctrl+C, which in the case of the demo generates some checkpoint files in ../models/<model-name>/train/, but in my case it doesn't.

This whole issue came to my attention when I ran Evaluation Step1, which threw an Index out of bounds error, which led me to check the snapshot folder.

EDIT:

the folder ../models/<model-name>/train seems to contain:

....models/movingJuly26-trainset95shuffle1/train# ls
log  log.txt  pose_cfg.yaml

ls ./log
events.out.tfevents.1530098506.<system name>

EDIT 2: I haven't checked the code enough to see whether the system saves checkpoints only after a set number of iterations, such that pressing Ctrl+C before that means I won't get any snapshots.

train.py, line 130:

# Save snapshot
if (it % cfg.save_iters == 0 and it != 0) or it == max_iter:
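
A toy illustration of that condition: with the default save_iters of 50000, interrupting at ~23000 iterations means the save branch never fired, so no snapshot exists.

save_iters, max_iter = 50000, 1030000
saved = [it for it in range(23000)
         if (it % save_iters == 0 and it != 0) or it == max_iter]
print(saved)  # [] -- nothing was written before the Ctrl+C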

IndexError running "Step1_EvaluateModelonDataset.py" (Installation specific python3 vs python call)

Hello,
When trying to run the above script (Step1_EvaluateModelonDataset.py), after stopping the ../../../train.py script with Ctrl+C after x iterations, I run into the following error:

Traceback (most recent call last):
  File "Step1_EvaluateModelonDataset.py", line 72, in <module>
    cfg['init_weights'] = os.path.join(modelfolder,'train',Snapshots[snapIndex])
IndexError: index -1 is out of bounds for axis 0 with size 0

I have reproduced this on both Ubuntu 16.04 LTS and Windows 10 - any ideas?
Thank you!
Sotiris

Step2_.._demo.ipynb: cannot convert float NaN to integer

Trying to test the demo with the demo data using a Jupyter notebook. Starting from step 3 as specified in the README, I just ran the Step2_.._demo.ipynb file and got this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-a7733a65592b> in <module>()
     58                 # get rid of values that are invisible >> thus user scored in left corner!
     59                 invisiblemarkersmask = (Xrescaled < invisibleboundary) * (Yrescaled < invisibleboundary)
---> 60                 Xrescaled[invisiblemarkersmask] = np.nan
     61                 Yrescaled[invisiblemarkersmask] = np.nan
     62 

ValueError: cannot convert float NaN to integer
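
The error indicates that Xrescaled and Yrescaled are integer arrays, and np.nan can only be stored in a float array. A minimal self-contained reproduction of the fix (cast to float before masking):

import numpy as np

coords = np.array([1, 2, 3])          # integer dtype, as in the demo data
mask = np.array([True, False, False])
coords = coords.astype(np.float64)    # cast first, then NaN assignment works
coords[mask] = np.nan
print(coords)                         # [nan  2.  3.]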

Output for Step 7

Hi Alex,

Everything seems to work fine until Step 7. Training of Resnet was good.

When I run step 7 according to your instructions for Step 1 evaluation, however, the program breaks when it analyzes the first image, with a message of "Folders already exist". Then running step 2 in the evaluation tools gives me this abnormal error. The screen dump is attached for reference.

Please kindly advise. Deeply appreciate your support.

Thanks in advance.

Logic in pose-tensorflow/nnet/pose_net.py fails for tensorflow vsn. 1.10

Describe the bug
The file "pose-tensorflow/nnet/pose_net.py" tests for tensorflow version using logic that worked great for versions up to 1.9, but fails for versions beyond that.

To Reproduce
Steps to reproduce the behavior:

  1. Follow the instructions for training on the website, using tensorflow vsn. 1.10
  2. Run the "train.py" script
  3. Observe error about "False" not being a legal value for scale.

To Fix
(Sorry, I could have created a pull request, but I didn't want to clone the whole project just for one trivial bug.)
Replace code
vers = tf.version
if float(vers[0:3]) < 1.4:
(line 51 and 52 in pose-tensorflow/nnet/pose_net.py)
with something like
vers = tf.version
vers = vers.split(".")
if float(vers[0]) < 1 or (float(vers[0])==1 and float(vers[1])<10):
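
A hedged alternative to float-parsing: compare (major, minor) tuples, which also orders 1.10 after 1.4 correctly:

import tensorflow as tf

major, minor = (int(x) for x in tf.__version__.split(".")[:2])
if (major, minor) < (1, 4):
    pass  # pre-1.4 code path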
Thanks for the great code; you are helping my student very kindly.

nvidia-docker problems (DOCKER Question + GPU driver installation)

I am having trouble launching the docker after installation. Here is the line where I get an error:

GPU=1 bash ./dlc-docker run -d -p 2351:8888 -e USER_HOME=$HOME/DeepLabCut --name alex_GPU1 dlc_user/dlc_tf1.2

The error is:

initialization error: driver error: failed to process request\\\\n\\\"\"": unknown.

I believe the issue is that nvidia-docker is no longer supported, and I had to install nvidia-docker2. The error seems to come when nvidia-docker is called from ./dlc-docker.

Analysis completely failed. Not sure what I did wrong (setting myconfig files...)

So after training my network for a week on a CPU, I used MakingLabeledVideo_fast.py, however, the output video is identical to the input video, i.e. no parts are labelled.

You can see there was a slight error when the code ran, but it didn't look fatal to me.

I then ran through the code manually and found a big problem.

On line 65 of MakingLabeledVideo_fast.py:
bodyparts2plot = list(np.unique(Dataframe.columns.get_level_values(1)))
The value of bodyparts2plot is ['Finger1', 'Finger2', 'Joystick', 'hand'], which are the values from your supplied data, not what I entered in myconfig.py

Obviously, somewhere something has been pointed to the wrong thing, but I have no idea where.* Clues that might help are in the directories:
C:\Users\wmkc\Documents\DeepLabCut\Evaluation-Tools\LabeledImages_DeepCut_resnet_50_95shuffle1_170000forTask_reachingbill

and

C:\Users\wmkc\Documents\DeepLabCut\Generating_a_Training_Set\data-reachingbill\datalabeled

The training data is appropriately labelled.

The other big thing is that my network was trained for 170000 iterations, not 37500 iterations, which, again, is what I think yours is trained to. So where have I gone wrong?

Any help deeply appreciated.

*For what it's worth, this is consistently the area I am having problems with while using DLC. I'm of course extremely grateful that you supplied the community with your code, but I have not found the instructions on how to set up the config files, or the path structure, clear. For instance, in myconfig.py, lines 14 through 18, it is not clear what these values should be: are they relative paths (if so, relative to what?) or complete paths, and what should be in this path?

(DLCdependencies) C:\Users\wmkc\Documents\DeepLabCut\Analysis-tools>python MakingLabeledVideo_fast.py
Starting ../videos/ ['mouse_reaching_crop.avi']
Loading mouse_reaching_crop.avi and data.
The video was not analyzed with this scorer: DeepCut_resnet50_reachingJan30shuffle1_500
Other scorers were found, however: ['mouse_reaching_cropDeepCut_resnet50_reachingJan30shuffle1_37500.h5']
Creating labeled video for: mouse_reaching_cropDeepCut_resnet50_reachingJan30shuffle1_37500.h5 instead.
Duration of video [s]: 195.02 , recorded with 50.0 fps!
Overall # of frames: 9751 with cropped frame dimensions: 358 465
Generating frames
100%|#############################################################################| 9751/9751 [01:09<00:00, 139.41it/s]


Step1_EvaluateModelon...py: Error when snapshotindex = all

To evaluate all models (snapshots of training stages), I set the variable snapshotindex = all in myconfig.py.

There is a variable mistake in the Step1_EvaluateModelonDataset.py file, since the conditioned variable snapindex was not used; the imported variable snapshotindex was used instead.
Correction:

        ##################################################
        # Compute predictions over images
        ##################################################

        if snapshotindex == -1:
            snapindices = [-1]
        elif snapshotindex == all:
            snapindices = range(len(Snapshots))
        else:
            print("Invalid choice, only -1 or all!")

        for snapindex in snapindices:
            cfg['init_weights'] = modelfolder + \
                '/train/' + Snapshots[snapindex] # xxx[snapshotindex]
            trainingsiterations = (
                cfg['init_weights'].split('/')[-1]).split('-')[-1]
            scorer = 'DeepCut' + "_resnet" + str(cfg["net_type"]) + "_" + str(
                int(trainFraction *
                    100)) + 'shuffle' + str(shuffle) + '_' + str(
                        trainingsiterations) + "forTask:" + Task

            print("Running ", scorer, trainingsiterations)

            try:
                Data = pd.read_hdf('Data_h5/' + scorer + '.h5',
                                   'df_with_missing')
                print("This net has already been evaluated!")
            except:
                # Specifying state of model (snapshot / training state)
                cfg['init_weights'] = modelfolder + \
                    '/train/' + Snapshots[snapindex] # xxx[snapshotindex]
                sess, inputs, outputs = predict.setup_pose_prediction(cfg)

However, there is still a tf error when running this script in this mode:

Traceback (most recent call last):
  File "Step1_EvaluateModelonDataset.py", line 132, in <module>
    sess, inputs, outputs = predict.setup_pose_prediction(cfg)
...

ValueError: Variable resnet_v1_50/conv1/weights already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "/tensorflow-3.0/local/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 216, in variable
    use_resource=use_resource)

...

To solve this, what I did (it worked, but I am not sure it is the best solution) was to reset the graph for each new iteration over snapshots, by adding tf.reset_default_graph() to the method setup_pose_prediction(cfg) in the file DeepLabCut/pose-tensorflow/nnet/predict.py:


def setup_pose_prediction(cfg):
    tf.reset_default_graph() # NEW LINE
    inputs = tf.placeholder(tf.float32, shape=[cfg.batch_size, None, None, 3])

    net_heads = pose_net(cfg).test(inputs)
    outputs = [net_heads['part_prob']]
    if cfg.location_refinement:
        outputs.append(net_heads['locref'])

    restorer = tf.train.Saver()

    sess = tf.Session()

    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

    # Restore variables from disk.
    restorer.restore(sess, cfg.init_weights)

    return sess, inputs, outputs

Bare except

In the process of debugging an issue, I found a bare except. Generally these are not good, as they can capture exceptions like KeyboardInterrupt that it is likely not intended to capture. Instead, it would be better if a specific exception type were used here (a short sketch follows below).
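
A short sketch of the suggested pattern (the names are illustrative, not from the codebase):

def load_results(path):
    # hypothetical stand-in for the real pd.read_hdf call that may fail
    raise IOError("File %s does not exist" % path)

try:
    data = load_results("Results/missing.h5")
except (IOError, KeyError) as err:  # catch only the expected failures
    print("could not load results:", err)
# KeyboardInterrupt and SystemExit now propagate instead of being swallowed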

Lost in how to write myconfig.py (Where to put self-extracted frames & csv files?)

I've gone through and made a folder that is full of img000.png ... img200.png, labelled all of the parts in Fiji (generating a randomly selected set wouldn't work very well here), and saved the labels in Results.csv, all of which are in

C:\Users\wmkc\Documents\DeepLabCut\Generating_a_Training_Set\bill-reaching

So that means
vidpath = '.'
filename = 'reachingvideo1.avi'

are irrelevant, as I don't need DLC to select my frames.

I've set cropping = False, as I've already cropped the video.

I've set multibodypartsfilename like this:
multibodypartsfilename="C:\Users\wmkc\Documents\DeepLabCut\Generating_a_Training_Set\bill-reaching\Results.csv"
as I can't see where else the directory would be set.

But where do I set where the PNGs are stored?

incomplete masking

In Step2_ConvertingLabels2Dataframe, lines 107-111 turn points within the top-left corner into NaNs:

# get rid of values that are invisible >> thus user scored in left corner!
Xrescaled[(Xrescaled < invisibleboundary) *
                 (Yrescaled < invisibleboundary)] = np.nan
Yrescaled[(Xrescaled < invisibleboundary) *
                 (Yrescaled < invisibleboundary)] = np.nan

but since Xrescaled is modified by the first call, the subsequent Xrescaled < invisibleboundary test fails, causing all the Y coordinates to remain in the resulting dataframe.

This is fixed by:

nan_inds = (Xrescaled < invisibleboundary) * (Yrescaled < invisibleboundary)
Xrescaled[nan_inds] = np.nan
Yrescaled[nan_inds] = np.nan

counting frames in MakingLabeledVideo is slow

Quick fix: line 93 in MakingLabeledVideo is:

nframes = np.sum(1 for j in clip.iter_frames())

which is slow (8.4s for a 2 minute video) because it has to iterate through frames in the video.

nframes = np.ceil(clip.fps*clip.duration).astype(np.int)

gives identical results in 0.0002 seconds.

Issue with Step 6 (TensorFlow installation problem)

Hi,
Your work sounds so interesting. I have got a Titan Xp GPU and am using Linux 18.04. I installed all the required packages and tensorflow-gpu. The GPU can be found by TensorFlow, and all of the first five steps can be run. But on step 6, I get the following error after making chunks for training:

Unfortunately, I made a mistake: instead of the new issue I was writing, I modified this one (I could not recover it, so I removed the information; hopefully you can recover it). Sorry for the mistake.

Any idea what can be the issue?
Thank you!

Step3_CheckLabels.py KeyError: 'No object named df_with_missing in the file' (Images not correctly loaded in step2)

Describe the problem
When running "Step3_CheckLabels.py", the following is outputted to the terminal.
$ python3 Step3_CheckLabels.py
4
<map object at 0x7f12408cdac8>
['nose', 'leftEye', 'rightEye', 'fin']
Traceback (most recent call last):
  File "Step3_CheckLabels.py", line 65, in <module>
    'CollectedData_' + cfg_scorer + '.h5', 'df_with_missing')
  File "/data/home/jli819/install/anaconda3/envs/deeplabcut/lib/python3.6/site-packages/pandas/io/pytables.py", line 394, in read_hdf
    return store.select(key, auto_close=auto_close, **kwargs)
  File "/data/home/jli819/install/anaconda3/envs/deeplabcut/lib/python3.6/site-packages/pandas/io/pytables.py", line 723, in select
    raise KeyError('No object named %s in the file' % key)
KeyError: 'No object named df_with_missing in the file'

To Reproduce
Steps to reproduce the behavior:

  1. Create and label images in FIJI.
  2. Run "Step2_ConvertingLabels2DataFrame.py"
    (The following is produced in the data-Task folder:
    -rw-r--r--. 1 jli819 2626 156 Jul 5 16:27 CollectedData_Trinity.csv
    -rw-r--r--. 1 jli819 2626 1.0K Jul 5 16:27 CollectedData_Trinity.h5
    drwxr-xr-x. 2 jli819 2626 8.0K Jul 5 16:26 fishvideo2)
  3. Run "Step3_CheckLabels.py"
  4. Error is produced

Additional Information
We have been skipping step 1 and labelling the images ourselves using FIJI. In our CSV file, the slices do not correspond to the image order, since some images were skipped and deleted. (For example, the slices may be 1, 2, 3, 6, 8, etc., and the images associated with slices 4 and 5 have been deleted from the folder.)
In addition, we were able to train the CNN last week by labelling the images ourselves, so we are not sure why we are suddenly getting this error and smaller .h5 files produced in step 2. (The .h5 file produced in Step2 from the first time we ran it when it worked was 83K)
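
One quick diagnostic, a sketch assuming pandas is installed: list the keys actually stored in the step-2 output file.

import pandas as pd

with pd.HDFStore("CollectedData_Trinity.h5", mode="r") as store:
    print(store.keys())  # a good file should list '/df_with_missing'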

modules need __init__.py files

Howdy - the imports within the package (e.g. from myconfig import ...) fail because there are no __init__.py files in any of the directories. Simple fix; I'll PR if you don't get to it first <3

Does not reach first iteration of training. (Only edit pose_config.yaml manually if you do so consistently)

Describe the problem
When training on labeled data, the code seems to get stuck after establishing the connection with the GPU. This only happens occasionally, since I was able to train on a different data set. When I look at nvidia-smi, the GPU does not appear to be processing anything, even though the memory is reserved.

To Reproduce
Steps to reproduce the behavior:

  1. Follow the Generating training data steps to create and begin training.
  2. edit some parts but not others.

Screenshots
log.txt file when program gets stuck.

2018-08-14 23:47:21 Config:
{'all_joints': [[0], [1], [2], [3], [4], [5], [6]],
'all_joints_names': ['P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7'],
'batch_size': 1,
'crop': False,
'crop_pad': 0,
'dataset': '../../UnaugmentedDataSet_FishTailAug14/FishTail_Elliott95shuffle1.mat',
'dataset_type': 'default',
'display_iters': 1000,
'fg_fraction': 0.25,
'global_scale': 0.8,
'init_weights': '../../pretrained/resnet_v1_50.ckpt',
'intermediate_supervision': False,
'intermediate_supervision_layer': 12,
'location_refinement': True,
'locref_huber_loss': True,
'locref_loss_weight': 0.05,
'locref_stdev': 7.2801,
'log_dir': 'log',
'max_input_size': 1000,
'mean_pixel': [123.68, 116.779, 103.939],
'mirror': False,
'multi_step': [[0.005, 10000],
[0.02, 430000],
[0.002, 730000],
[0.001, 1030000]],
'net_type': 'resnet_50',
'num_joints': 6,
'optimizer': 'sgd',
'pos_dist_thresh': 17,
'regularize': False,
'save_iters': 50000,
'scale_jitter_lo': 0.5,
'scale_jitter_up': 1.5,
'scoremap_dir': 'test',
'shuffle': True,
'snapshot_prefix': './snapshot',
'stride': 8.0,
'use_gt_segm': False,
'video': False,
'video_batch': False,
'weigh_negatives': False,
'weigh_only_present_joints': False,
'weigh_part_predictions': False,
'weight_decay': 0.0001}
2018-08-14 23:47:21 From /home/eabe/DeepLabCut/pose-tensorflow/nnet/pose_net.py:52: calling resnet_arg_scope (from tensorflow.contrib.slim.python.slim.nets.resnet_utils) with is_training is deprecated and will be removed after 2017-08-01.
Instructions for updating:
Pass is_training directly to the network instead of the arg_scope.
2018-08-14 23:47:23 logits.dtype=<dtype: 'float32'>.
2018-08-14 23:47:23 multi_class_labels.dtype=<dtype: 'float32'>.
2018-08-14 23:47:23 losses.dtype=<dtype: 'float32'>.
2018-08-14 23:55:32 Restoring parameters from ../../pretrained/resnet_v1_50.ckpt

Additional context
Training on 77 images, I have tried with 3 points and 7 points and it still gets stuck. It seems to get stuck when loading in the .mat file: when switching out the path to another one and changing to the appropriate number of joints, it works. I have tried recreating the training set and it still does not get past the initial setup.

Running demo "name 'clip' is not defined" (Problem when scorer is not correctly specified during MakingLabeledVideo.py)

I am trying to run the demo, but after running MakingLabeledVideo.py I get the following error:

Traceback (most recent call last):
  File "MakingLabeledVideo.py", line 92, in <module>
    ny, nx = clip.size  # dimensions of frame (height, width)
NameError: name 'clip' is not defined

I have followed steps (3) to (8) to run the code on the demo video.

Ubuntu 16, CUDA 8 , CUDNN 5, python 3.5
I think it is an issue with moviepy, but I am not sure.

Cannot find installation of real FFmpeg (scikit-video installation)

My understanding is that FFmpeg was installed during the DLC installation, so I'm not sure what to do about this. Any ideas appreciated.

EDIT: I was able to manually install FFmpeg and set the PATH variable to it, but it seems like this isn't the ideal approach.
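
An in-code alternative to editing PATH, assuming scikit-video is the backend raising the error (the path below is illustrative):

import skvideo
skvideo.setFFmpegPath(r"C:\ffmpeg\bin")  # folder containing ffmpeg and ffprobe
import skvideo.io                        # import io only after setting the path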

(DLCdependencies) C:\Users\wmkc\Documents\DeepLabCut\Analysis-tools>python MakingLabeledVideo_fast.py
Starting ../videos/ ['mouse_reaching_crop.avi']
Error: %s Cannot find installation of real FFmpeg (which comes with ffprobe).
Loading mouse_reaching_crop.avi and data.
The video was not analyzed with this scorer: DeepCut_resnet50_reachingJan30shuffle1_500
Other scorers were found, however: ['mouse_reaching_cropDeepCut_resnet50_reachingJan30shuffle1_37500.h5']
Creating labeled video for: mouse_reaching_cropDeepCut_resnet50_reachingJan30shuffle1_37500.h5 instead.
Error: %s Cannot find installation of real FFmpeg (which comes with ffprobe).
Duration of video [s]: 325.03333333333336 , recorded with 30 fps!
Overall # of frames: 9751 with cropped frame dimensions: 0 0
Generating frames
0%| | 0/9751 [00:00<?, ?it/s]Error: %s 'VideoProcessorSK' object has no attribute 'vid'

Traceback (most recent call last):
  File "MakingLabeledVideo_fast.py", line 110, in <module>
    Dataframe = pd.read_hdf(dataname)
  File "C:\Users\wmkc\AppData\Local\Continuum\anaconda3\envs\DLCdependencies\lib\site-packages\pandas\io\pytables.py", line 371, in read_hdf
    'File %s does not exist' % path_or_buf)
FileNotFoundError: File mouse_reaching_cropDeepCut_resnet50_reachingJan30shuffle1_500.h5 does not exist

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "MakingLabeledVideo_fast.py", line 125, in <module>
    CreateVideo(clip,Dataframe)
  File "MakingLabeledVideo_fast.py", line 88, in CreateVideo
    clip.save_frame(frame)
  File "C:\Users\wmkc\Documents\DeepLabCut\Analysis-tools\VideoProcessor.py", line 137, in save_frame
    self.svid.writeFrame(frame)
AttributeError: 'VideoProcessorSK' object has no attribute 'svid'

Warning after running Step2_ConvertingLabels2DataFrame.py

I just wanted to know if this warning would cause any potential problems:

It happened after I ran $ python3 Step2_ConvertingLabels2DataFrame.py in the terminal on Ubuntu.

the warning:

/home/gcp_instance-2/miniconda3/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/home/gcp_instance-2/miniconda3/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
Not all data converted!
Merging scorer's data.

Thank you

Step2 Converting labels | ValueError: cannot reindex from a duplicate axis

Describe the problem
I'm trying to create a dataset to train a multi-person tracker. I've followed the tutorial to a T but seem to be hitting an error. I'm not really sure what is causing it to happen.

The images I used have multiple of the same body part but on different subjects. I think this might be what is causing the problem, but I'm not sure

I've attached the traceback below. Would appreciate some feedback.

Additional context
ValueError Traceback (most recent call last)
in ()
50 # dframe.set_index('Slice')
51 new_index=pd.Index(np.arange(len(files))+1,name='Slice')
---> 52 dframe=dframe.set_index('Slice').reindex(new_index)
53 dframe=dframe.reset_index()
54

~/anaconda3/lib/python3.6/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
184 @wraps(func)
185 def wrapper(*args, **kwargs):
--> 186 return func(*args, **kwargs)
187
188 if not PY2:

~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in reindex(self, *args, **kwargs)
3561 kwargs.pop('axis', None)
3562 kwargs.pop('labels', None)
-> 3563 return super(DataFrame, self).reindex(**kwargs)
3564
3565 @appender(_shared_docs['reindex_axis'] % _shared_doc_kwargs)

~/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py in reindex(self, *args, **kwargs)
3683 # perform the reindex on the axes
3684 return self._reindex_axes(axes, level, limit, tolerance, method,
-> 3685                 fill_value, copy).__finalize__(self)
3686
3687 def _reindex_axes(self, axes, level, limit, tolerance, method, fill_value,

~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
3496 if index is not None:
3497 frame = frame._reindex_index(index, method, copy, level,
-> 3498 fill_value, limit, tolerance)
3499
3500 return frame

~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _reindex_index(self, new_index, method, copy, level, fill_value, limit, tolerance)
3507 return self._reindex_with_indexers({0: [new_index, indexer]},
3508 copy=copy, fill_value=fill_value,
-> 3509 allow_dups=False)
3510
3511 def _reindex_columns(self, new_columns, method, copy, level,

~/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
3804 fill_value=fill_value,
3805 allow_dups=allow_dups,
-> 3806 copy=copy)
3807
3808 if copy and new_data is self._data:

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
4412 # some axes don't allow reindexing with dups
4413 if not allow_dups:
-> 4414 self.axes[axis]._can_reindex(indexer)
4415
4416 if axis >= self.ndim:

~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in _can_reindex(self, indexer)
3557 # trying to reindex on an axis with duplicates
3558 if not self.is_unique and len(indexer):
-> 3559 raise ValueError("cannot reindex from a duplicate axis")
3560
3561 def reindex(self, target, method=None, level=None, limit=None,

ValueError: cannot reindex from a duplicate axis
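
Your suspicion fits the traceback: reindex() refuses to operate on a duplicated index, and with several subjects labeled per image, each 'Slice' (frame) value occurs more than once. A quick hedged check, assuming dframe is the DataFrame built in the tutorial cell above:

# Assuming `dframe` is the DataFrame from the cell above: any output here means
# set_index('Slice').reindex(...) will raise "cannot reindex from a duplicate
# axis", since frame numbers repeat once per labeled subject.
counts = dframe['Slice'].value_counts()
print(counts[counts > 1])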

issue running imageio in step 1 (imageio installation for MacOS)

Hi,

I am trying to set up your pipeline on Mac OS El Capitan 10.11.6. I installed Jupyter Notebook to run the first step. When I run "imageio.plugins.ffmpeg.download()" I get the following error:
"Imageio: 'ffmpeg-osx-v3.2.4' was not found on your computer; downloading it now.
Error while fetching file: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>.
Error while fetching file: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>.
Error while fetching file: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>.
Error while fetching file: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>.

IOError Traceback (most recent call last)
in ()
3 get_ipython().magic(u'matplotlib inline')
4 import imageio
----> 5 imageio.plugins.ffmpeg.download()
6 import matplotlib
7 import matplotlib.pyplot as plt

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/imageio/plugins/ffmpeg.pyc in download(directory, force_download)
71 get_remote_file(fname=fname,
72 directory=directory,
---> 73 force_download=force_download)
74
75

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/imageio/core/fetching.pyc in get_remote_file(fname, directory, force_download, auto)
125 return filename
126 else: # pragma: no cover
--> 127 _fetch_file(url, filename)
128 return filename
129

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/imageio/core/fetching.pyc in _fetch_file(url, file_name, print_destination)
181 raise IOError('Unable to download %r. Perhaps there is a no internet '
182 'connection? If there is, please report this problem.' %
--> 183 os.path.basename(file_name))
184
185

IOError: Unable to download 'ffmpeg-osx-v3.2.4'. Perhaps there is a no internet connection? If there is, please report this problem."

Did anything like that happen to you? Do you have a fix for it? I tried everything I could find online, but it did not help.
Thanks,
Marie
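
This looks like the macOS certificate-store problem rather than an imageio bug: the Python performing the download cannot verify the server's TLS certificate. On newer python.org installs, running the bundled "Install Certificates.command" is the clean fix; a commonly cited workaround, sketched below and only advisable on a network you trust, disables verification for this one-off download:

import ssl
import imageio

# Skip certificate verification just for the ffmpeg download (insecure; use once).
ssl._create_default_https_context = ssl._create_unverified_context
imageio.plugins.ffmpeg.download()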

IndexError running "Step1_EvaluateModelonDataset.py" (Run training until at least 1 snapshot is stored before proceeding to "Step1_EvaluateModelonDataset.py")

The error produced is:
" File "/data/home/jli819/deeplabcut/DeepLabCut/Evaluation-Tools/Step1_EvaluateModelonDataset.py", line 68, in
cfg['init_weights'] = os.path.join(modelfolder,'train',Snapshots[snapIndex])
IndexError: index -1 is out of bounds for axis 0 with size 0"

We saw the other issue with an IndexError running the same file, but their solution did not fix our problem.

Has anyone else run into this error? If so, what is a possible fix? The error occurred on Red Hat Enterprise Linux Server release 7.5 (Maipo).

Thanks!!
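
As the amended title indicates, Snapshots is populated from the checkpoint files that training writes into the model's train folder, so indexing it with -1 fails while that folder is still empty. A hedged sketch of the check (the model path and the snapshot filename prefix below are assumptions based on the script's behavior):

import os

# `modelfolder` is a placeholder for the model directory used by the script.
modelfolder = 'pose-tensorflow/models/reachingJan30-trainset95shuffle1'
train_dir = os.path.join(modelfolder, 'train')
# An empty list here means no snapshot has been saved yet, so
# Snapshots[snapIndex] raises the IndexError; train until at least one
# snapshot (see save_iters in the training config) is stored first.
print([f for f in os.listdir(train_dir) if f.startswith('snapshot')])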

AttributeError: 'DataFrame' object has no attribute 'X'

Hello,

When running 'Step2_ConvertingLabels2DataFrame.py', I had to comment out the ", sep='\t'" in the line that reads:

dframe = pd.read_csv(datafile + ".xls")#, sep='\t'

Otherwise I got this error:

AttributeError: 'DataFrame' object has no attribute 'X'

After running Step 3, my labeled images look fine.

I use Python 3.6.4 and Pandas 0.22.0.

Cheers
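
For anyone hitting the same thing: with sep='\t', a comma-separated export is parsed as a single wide column, so the expected 'X' column never materializes; dropping the separator lets pandas fall back to its comma default. A hedged variant that auto-detects the delimiter instead of hard-coding either one:

import pandas as pd

datafile = 'hand'  # placeholder: the body-part file stem used in the script
# sep=None with the python engine sniffs the delimiter, so the same call reads
# both tab- and comma-separated exports (a sketch, not the shipped Step2 code).
dframe = pd.read_csv(datafile + ".xls", sep=None, engine="python")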

Running into a ResourceExhaustedError (Avoid having multiple DeepLabCuts per GPU)

When running step 8 from the README on the example data, we encountered the following error (traceback below). We used a stock Anaconda install with tensorflow-gpu on a Titan X. We can provide more details if needed; any help with this issue would be appreciated.

Traceback:
W tensorflow/core/common_runtime/bfc_allocator.cc:274] ******************x************************_**_*****______***********x***********xxxx***********xxxx
W tensorflow/core/common_runtime/bfc_allocator.cc:275] Ran out of memory trying to allocate 9.00MiB.  See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:993] Resource exhausted: OOM when allocating tensor with shape[3,3,512,512]
Traceback (most recent call last):
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1022, in _do_call
    return fn(*args)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1004, in _run_fn
    status, run_metadata)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/contextlib.py", line 88, in __exit__
    next(self.gen)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3,3,512,512]
	 [[Node: resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights/Assign = Assign[T=DT_FLOAT, _class=["loc:@resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights, resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights/Initializer/truncated_normal)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "AnalyzeVideos.py", line 106, in <module>
    sess, inputs, outputs = predict.setup_pose_prediction(cfg)
  File "/groups/dudman/home/riccellit/Developer/DeepLabCut-master/pose-tensorflow/nnet/predict.py", line 20, in setup_pose_prediction
    sess.run(tf.global_variables_initializer())
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3,3,512,512]
	 [[Node: resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights/Assign = Assign[T=DT_FLOAT, _class=["loc:@resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights, resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights/Initializer/truncated_normal)]]

Caused by op 'resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights/Assign', defined at:
  File "AnalyzeVideos.py", line 106, in <module>
    sess, inputs, outputs = predict.setup_pose_prediction(cfg)
  File "/groups/dudman/home/riccellit/Developer/DeepLabCut-master/pose-tensorflow/nnet/predict.py", line 11, in setup_pose_prediction
    net_heads = pose_net(cfg).test(inputs)
  File "/groups/dudman/home/riccellit/Developer/DeepLabCut-master/pose-tensorflow/nnet/pose_net.py", line 89, in test
    heads = self.get_net(inputs)
  File "/groups/dudman/home/riccellit/Developer/DeepLabCut-master/pose-tensorflow/nnet/pose_net.py", line 85, in get_net
    net, end_points = self.extract_features(inputs)
  File "/groups/dudman/home/riccellit/Developer/DeepLabCut-master/pose-tensorflow/nnet/pose_net.py", line 54, in extract_features
    global_pool=False, output_stride=16)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 248, in resnet_v1_50
    scope=scope)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 203, in resnet_v1
    net = resnet_utils.stack_blocks_dense(net, blocks, output_stride)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py", line 218, in stack_blocks_dense
    rate=rate)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 118, in bottleneck
    residual, depth_bottleneck, 3, stride, rate=rate, scope='conv2')
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py", line 131, in conv2d_same
    scope=scope)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 907, in convolution
    outputs = layer.apply(inputs)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 303, in apply
    return self.__call__(inputs, **kwargs)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 269, in __call__
    self.build(input_shapes[0])
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/convolutional.py", line 138, in build
    dtype=self.dtype)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 988, in get_variable
    custom_getter=custom_getter)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 890, in get_variable
    custom_getter=custom_getter)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 341, in get_variable
    validate_shape=validate_shape)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 258, in variable_getter
    variable_getter=functools.partial(getter, **kwargs))
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 208, in _add_variable
    trainable=trainable and self.trainable)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1310, in layer_variable_getter
    return _model_variable_getter(getter, *args, **kwargs)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1299, in _model_variable_getter
    custom_getter=getter)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 268, in model_variable
    partitioner=partitioner, custom_getter=custom_getter)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 225, in variable
    partitioner=partitioner)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 333, in _true_getter
    caching_device=caching_device, validate_shape=validate_shape)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 684, in _get_single_variable
    validate_shape=validate_shape)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 197, in __init__
    expected_shape=expected_shape)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 305, in _init_from_args
    validate_shape=validate_shape).op
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
    use_locking=use_locking, name=name)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/groups/dudman/home/riccellit/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
    self._traceback = _extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[3,3,512,512]
	 [[Node: resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights/Assign = Assign[T=DT_FLOAT, _class=["loc:@resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights, resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/weights/Initializer/truncated_normal)]]
Environment:
name: base
channels:
  - defaults
  - conda-forge
dependencies:
  - boost-cpp=1.66.0=1
  - moviepy=0.2.3.5=py_0
  - msgpack-c=2.1.5=0
  - sk-video=1.1.10=py_3
  - _ipyw_jlab_nb_ext_conf=0.1.0=py36he11e457_0
  - alabaster=0.7.10=py36h306e16b_0
  - anaconda=5.2.0=py36_3
  - anaconda-client=1.6.14=py36_0
  - anaconda-navigator=1.8.7=py36_0
  - anaconda-project=0.8.2=py36h44fb852_0
  - asn1crypto=0.24.0=py36_0
  - astroid=1.6.3=py36_0
  - astropy=3.0.2=py36h3010b51_1
  - attrs=18.1.0=py36_0
  - babel=2.5.3=py36_0
  - backcall=0.1.0=py36_0
  - backports=1.0=py36hfa02d7e_1
  - backports.shutil_get_terminal_size=1.0.0=py36hfea85ff_2
  - beautifulsoup4=4.6.0=py36h49b8c8c_1
  - bitarray=0.8.1=py36h14c3975_1
  - bkcharts=0.2=py36h735825a_0
  - blas=1.0=mkl
  - blaze=0.11.3=py36h4e06776_0
  - bleach=2.1.3=py36_0
  - blosc=1.14.3=hdbcaa40_0
  - bokeh=0.12.16=py36_0
  - boto=2.48.0=py36h6e4cd66_1
  - bottleneck=1.2.1=py36haac1ea0_0
  - bzip2=1.0.6=h14c3975_5
  - ca-certificates=2018.03.07=0
  - cairo=1.14.12=h7636065_2
  - certifi=2018.4.16=py36_0
  - cffi=1.11.5=py36h9745a5d_0
  - chardet=3.0.4=py36h0f667ec_1
  - click=6.7=py36h5253387_0
  - cloudpickle=0.5.3=py36_0
  - clyent=1.2.2=py36h7e57e65_1
  - colorama=0.3.9=py36h489cec4_0
  - conda=4.5.4=py36_0
  - conda-build=3.10.5=py36_0
  - conda-env=2.6.0=h36134e3_1
  - conda-verify=2.0.0=py36h98955d8_0
  - contextlib2=0.5.5=py36h6c84a62_0
  - cryptography=2.2.2=py36h14c3975_0
  - cudatoolkit=7.5=2
  - cudnn=5.1=0
  - curl=7.60.0=h84994c4_0
  - cycler=0.10.0=py36h93f1223_0
  - cython=0.28.2=py36h14c3975_0
  - cytoolz=0.9.0.1=py36h14c3975_0
  - dask=0.17.5=py36_0
  - dask-core=0.17.5=py36_0
  - datashape=0.5.4=py36h3ad6b5c_0
  - dbus=1.13.2=h714fa37_1
  - decorator=4.3.0=py36_0
  - distributed=1.21.8=py36_0
  - docutils=0.14=py36hb0f60f5_0
  - entrypoints=0.2.3=py36h1aec115_2
  - et_xmlfile=1.0.1=py36hd6bccc3_0
  - expat=2.2.5=he0dffb1_0
  - fastcache=1.0.2=py36h14c3975_2
  - ffmpeg=4.0=h04d0a96_0
  - filelock=3.0.4=py36_0
  - flask=1.0.2=py36_1
  - flask-cors=3.0.4=py36_0
  - fontconfig=2.12.6=h49f89f6_0
  - freetype=2.8=hab7d2ae_1
  - get_terminal_size=1.0.0=haa9412d_0
  - gevent=1.3.0=py36h14c3975_0
  - glib=2.56.1=h000015b_0
  - glob2=0.6=py36he249c77_0
  - gmp=6.1.2=h6c8ec71_1
  - gmpy2=2.0.8=py36hc8893dd_2
  - graphite2=1.3.11=h16798f4_2
  - greenlet=0.4.13=py36h14c3975_0
  - gst-plugins-base=1.14.0=hbbd80ab_1
  - gstreamer=1.14.0=hb453b48_1
  - h5py=2.7.1=py36ha1f6525_2
  - harfbuzz=1.7.6=h5f0a787_1
  - hdf5=1.10.2=hba1933b_1
  - heapdict=1.0.0=py36_2
  - html5lib=1.0.1=py36h2f9c1c0_0
  - icu=58.2=h9c2bf20_1
  - idna=2.6=py36h82fb2a8_1
  - imageio=2.3.0=py36_0
  - imagesize=1.0.0=py36_0
  - intel-openmp=2018.0.0=8
  - ipykernel=4.8.2=py36_0
  - ipython=6.4.0=py36_0
  - ipython_genutils=0.2.0=py36hb52b0d5_0
  - ipywidgets=7.2.1=py36_0
  - isort=4.3.4=py36_0
  - itsdangerous=0.24=py36h93cc618_1
  - jbig=2.1=hdba287a_0
  - jdcal=1.4=py36_0
  - jedi=0.12.0=py36_1
  - jinja2=2.10=py36ha16c418_0
  - jpeg=9b=h024ee3a_2
  - jsonschema=2.6.0=py36h006f8b5_0
  - jupyter=1.0.0=py36_4
  - jupyter_client=5.2.3=py36_0
  - jupyter_console=5.2.0=py36he59e554_1
  - jupyter_core=4.4.0=py36h7c827e3_0
  - jupyterlab=0.32.1=py36_0
  - jupyterlab_launcher=0.10.5=py36_0
  - kiwisolver=1.0.1=py36h764f252_0
  - lazy-object-proxy=1.3.1=py36h10fcdad_0
  - libcurl=7.60.0=h1ad7b7a_0
  - libedit=3.1.20170329=h6b74fdf_2
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=7.2.0=hdf63c60_3
  - libgfortran-ng=7.2.0=hdf63c60_3
  - libopus=1.2.1=hb9ed12e_0
  - libpng=1.6.34=hb9fc6fc_0
  - libprotobuf=3.5.2=h6f1eeef_0
  - libsodium=1.0.16=h1bed415_0
  - libssh2=1.8.0=h9cfc8f7_4
  - libstdcxx-ng=7.2.0=hdf63c60_3
  - libtiff=4.0.9=he85c1e1_1
  - libtool=2.4.6=h544aabb_3
  - libvpx=1.7.0=h439df22_0
  - libxcb=1.13=h1bed415_1
  - libxml2=2.9.8=h26e45fe_1
  - libxslt=1.1.32=h1312cb7_0
  - llvmlite=0.23.1=py36hdbcaa40_0
  - locket=0.2.0=py36h787c0ad_1
  - lxml=4.2.1=py36h23eabaa_0
  - lzo=2.10=h49e0be7_2
  - markupsafe=1.0=py36hd9260cd_1
  - matplotlib=2.2.2=py36h0e671d2_1
  - mccabe=0.6.1=py36h5ad9710_1
  - mistune=0.8.3=py36h14c3975_1
  - mkl=2018.0.2=1
  - mkl-service=1.1.2=py36h17a0993_4
  - mkl_fft=1.0.1=py36h3010b51_0
  - mkl_random=1.0.1=py36h629b387_0
  - more-itertools=4.1.0=py36_0
  - mpc=1.0.3=hec55b23_5
  - mpfr=3.1.5=h11a74b3_2
  - mpmath=1.0.0=py36hfeacd6b_2
  - msgpack-python=0.5.6=py36h6bb024c_0
  - multipledispatch=0.5.0=py36_0
  - navigator-updater=0.2.1=py36_0
  - nbconvert=5.3.1=py36hb41ffb7_0
  - nbformat=4.4.0=py36h31c9010_0
  - ncurses=6.1=hf484d3e_0
  - networkx=2.1=py36_0
  - nltk=3.3.0=py36_0
  - nose=1.3.7=py36hcdf7029_2
  - notebook=5.5.0=py36_0
  - numba=0.38.0=py36h637b7d7_0
  - numexpr=2.6.5=py36h7bf3b9c_0
  - numpy=1.14.3=py36hcd700cb_1
  - numpy-base=1.14.3=py36h9be14a7_1
  - numpydoc=0.8.0=py36_0
  - odo=0.5.1=py36h90ed295_0
  - olefile=0.45.1=py36_0
  - openpyxl=2.5.3=py36_0
  - openssl=1.0.2o=h20670df_0
  - packaging=17.1=py36_0
  - pandas=0.23.0=py36h637b7d7_0
  - pandoc=1.19.2.1=hea2e7c5_1
  - pandocfilters=1.4.2=py36ha6701b7_1
  - pango=1.41.0=hd475d92_0
  - parso=0.2.0=py36_0
  - partd=0.3.8=py36h36fd896_0
  - patchelf=0.9=hf79760b_2
  - path.py=11.0.1=py36_0
  - pathlib2=2.3.2=py36_0
  - patsy=0.5.0=py36_0
  - pcre=8.42=h439df22_0
  - pep8=1.7.1=py36_0
  - pexpect=4.5.0=py36_0
  - pickleshare=0.7.4=py36h63277f8_0
  - pillow=5.1.0=py36h3deb7b8_0
  - pip=10.0.1=py36_0
  - pixman=0.34.0=hceecf20_3
  - pkginfo=1.4.2=py36_1
  - pluggy=0.6.0=py36hb689045_0
  - ply=3.11=py36_0
  - prompt_toolkit=1.0.15=py36h17d85b1_0
  - protobuf=3.5.2=py36hf484d3e_0
  - psutil=5.4.5=py36h14c3975_0
  - ptyprocess=0.5.2=py36h69acd42_0
  - py=1.5.3=py36_0
  - pycodestyle=2.4.0=py36_0
  - pycosat=0.6.3=py36h0a5515d_0
  - pycparser=2.18=py36hf9f622e_1
  - pycrypto=2.6.1=py36h14c3975_8
  - pycurl=7.43.0.1=py36hb7f436b_0
  - pyflakes=1.6.0=py36h7bd6a15_0
  - pygments=2.2.0=py36h0d3125c_0
  - pylint=1.8.4=py36_0
  - pyodbc=4.0.23=py36hf484d3e_0
  - pyopenssl=18.0.0=py36_0
  - pyparsing=2.2.0=py36hee85983_1
  - pyqt=5.9.2=py36h751905a_0
  - pysocks=1.6.8=py36_0
  - pytables=3.4.3=py36h02b9ad4_2
  - pytest=3.5.1=py36_0
  - pytest-arraydiff=0.2=py36_0
  - pytest-astropy=0.3.0=py36_0
  - pytest-doctestplus=0.1.3=py36_0
  - pytest-openfiles=0.3.0=py36_0
  - pytest-remotedata=0.2.1=py36_0
  - python=3.6.5=hc3d631a_2
  - python-dateutil=2.7.3=py36_0
  - pytz=2018.4=py36_0
  - pywavelets=0.5.2=py36he602eb0_0
  - pyyaml=3.12=py36hafb9ca4_1
  - pyzmq=17.0.0=py36h14c3975_0
  - qt=5.9.5=h7e424d6_0
  - qtawesome=0.4.4=py36h609ed8c_0
  - qtconsole=4.3.1=py36h8f73b5b_0
  - qtpy=1.4.1=py36_0
  - readline=7.0=ha6073c6_4
  - requests=2.18.4=py36he2e5f8d_1
  - rope=0.10.7=py36h147e2ec_0
  - ruamel_yaml=0.15.35=py36h14c3975_1
  - scikit-image=0.13.1=py36h14c3975_1
  - scikit-learn=0.19.1=py36h7aa7ec6_0
  - scipy=1.1.0=py36hfc37229_0
  - seaborn=0.8.1=py36hfad7ec4_0
  - send2trash=1.5.0=py36_0
  - setuptools=39.1.0=py36_0
  - simplegeneric=0.8.1=py36_2
  - singledispatch=3.4.0.3=py36h7a266c3_0
  - sip=4.19.8=py36hf484d3e_0
  - six=1.11.0=py36h372c433_1
  - snappy=1.1.7=hbae5bb6_3
  - snowballstemmer=1.2.1=py36h6febd40_0
  - sortedcollections=0.6.1=py36_0
  - sortedcontainers=1.5.10=py36_0
  - sphinx=1.7.4=py36_0
  - sphinxcontrib=1.0=py36h6d0f590_1
  - sphinxcontrib-websupport=1.0.1=py36hb5cb234_1
  - spyder=3.2.8=py36_0
  - sqlalchemy=1.2.7=py36h6b74fdf_0
  - sqlite=3.23.1=he433501_0
  - statsmodels=0.9.0=py36h3010b51_0
  - sympy=1.1.1=py36hc6d1c1c_0
  - tblib=1.3.2=py36h34cf8b6_0
  - tensorflow-gpu=1.0.1=py36_4
  - terminado=0.8.1=py36_1
  - testpath=0.3.1=py36h8cadb63_0
  - tk=8.6.7=hc745277_3
  - toolz=0.9.0=py36_0
  - tornado=5.0.2=py36_0
  - tqdm=4.23.4=py36_0
  - traitlets=4.3.2=py36h674d592_0
  - typing=3.6.4=py36_0
  - unicodecsv=0.14.1=py36ha668878_0
  - unixodbc=2.3.6=h1bed415_0
  - urllib3=1.22=py36hbe7ace6_0
  - wcwidth=0.1.7=py36hdf4376a_0
  - webencodings=0.5.1=py36h800622e_1
  - werkzeug=0.14.1=py36_0
  - wheel=0.31.1=py36_0
  - widgetsnbextension=3.2.1=py36_0
  - wrapt=1.10.11=py36h28b7045_0
  - xlrd=1.1.0=py36h1db9f0c_1
  - xlsxwriter=1.0.4=py36_0
  - xlwt=1.3.0=py36h7b00a1f_0
  - xz=5.2.4=h14c3975_4
  - yaml=0.1.7=had09818_2
  - zeromq=4.2.5=h439df22_0
  - zict=0.1.3=py36h3a3bf81_0
  - zlib=1.2.11=ha838bed_2
  - pip:
    - easydict==1.7
    - tables==3.4.3
    - tensorflow==1.0.1
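
As the amended title suggests, this OOM usually means the GPU's memory was already claimed, for example by another DeepLabCut/TensorFlow process running on the same card. With the TF 1.x build in this environment, one hedged mitigation is to let the session claim memory on demand rather than reserving it all up front:

import tensorflow as tf

# TF 1.x sketch: allow_growth makes the session allocate GPU memory
# incrementally, which helps when a GPU must be shared, though running a
# single job per GPU remains the safer option.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)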

Step 7 Program displayed: "Folder Already exist!"

Hi,

Everything works up through Step 6 of running DeepLabCut on Windows. When I proceed to Step 7, it terminates with the message "Folder Already exist!". I also noticed that only the first image was analyzed, at 0.0%!

When I run Step2_AnalysisofResults.py, it says "IndexError: Index out of range".

Can you please advise what went wrong?

Thanks in advance.

Best Regards,

David

Fraction of images used as training and test sets seems to be wrong

Describe the bug
The fraction of images used for the training and test sets seems to be wrong when using the scripts in Generating_a_Training_Set. In particular, setting the training fraction to 0.95 seems to produce 5% training and 95% test data.

To Reproduce
I followed the tutorial at https://alexemg.github.io/DeepLabCut/docs/demo-guide.html using my own video. The file UnaugmentedDataSet_2choiceAug21/Documentation_data-2choice_95shuffle1.pickle seems wrong to me:

>>> pickle.load(open("UnaugmentedDataSet_2choiceAug21/Documentation_data-2choice_95shuffle1.pickle", "rb"))

[[{'image': '../../UnaugmentedDataSet_2choiceAug21/data-2choice/20180404_144847_d1_3517-0000/img31550.png',
   'joints': array([[  0, 151, 103],
          [  1, 231, 160],
          [  2, 278, 191],
          [  3, 261, 218]]),
   'size': array([  3, 304, 400])},
  {'image': '../../UnaugmentedDataSet_2choiceAug21/data-2choice/20180404_144847_d1_3517-0000/img48931.png',
   'joints': array([[  0, 187, 240],
          [  1, 228, 153],
          [  2, 261,  80],
          [  3, 271, 111]]),
   'size': array([  3, 304, 400])},
  {'image': '../../UnaugmentedDataSet_2choiceAug21/data-2choice/20180404_144847_d1_3517-0000/img13289.png',
   'joints': array([[  0, 195, 241],
          [  1, 239, 152],
          [  2, 258,  87],
          [  3, 277, 110]]),
   'size': array([  3, 304, 400])}],
 array([14, 28,  3]),
 array([34, 19, 26, 12, 33, 22, 38, 39, 13, 30,  9, 17,  1, 32, 27, 23,  7,
         5, 35,  8, 25, 10, 24, 21, 37, 18, 20, 31, 29, 42,  0, 40, 15,  2,
         6, 11, 43, 41,  4, 16, 36]),
 0.95]

I would expect the last two arrays to be switched, with the listed images reflecting the larger split.
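
A quick hedged way to inspect the suspected swap directly from the pickle (variable names below are mine; the file holds the image metadata, two index arrays, and the fraction):

import pickle

with open("UnaugmentedDataSet_2choiceAug21/"
          "Documentation_data-2choice_95shuffle1.pickle", "rb") as f:
    data, idx_a, idx_b, fraction = pickle.load(f)

# With 44 images in total and fraction 0.95, one would expect roughly a 42/2
# split, not the 3/41 printed here.
print(len(idx_a), len(idx_b), fraction)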

Desktop (please complete the following information):

  • OS: Ubuntu 18.10

Thanks for a promising package!

Classification of behaviors tracked by DLC

I have been using your software for a little under a month now and have been able to train it to track a few different behaviors. However, the software does not appear to classify behaviors based on the pose data. My lab would like to use it to detect and classify certain behaviors so that we can run more targeted experiments on our mice: given the pose data, the software would classify the behavior it is tracking. For example, if I picked a random frame from the analyzed video, it would tell me that the mouse was rearing, grooming, etc. at that frame. The closest program I have found with this capability is this one:

paper: http://serre-lab.clps.brown.edu/wp-content/uploads/2012/10/ncomms1064.pdf
code: http://cbcl.mit.edu/software-datasets/mouse/

To be completely honest, I am not sure how difficult this would be to implement in your code, but I think it would be a rather useful addition.

More than happy to chat more about this if you'd like a more detailed description of the feature request!

Best,

DavidSBU

cuDNN launch failure : input shape([1,3,395,536]) filter shape([7,7,3,64])

Hi there,
When trying to retrain the network using the example labels, just to test whether the installation is OK, I get a mismatch error like this:

(tensorflow) mic@mic-OptiPlex-9010:~/DeepLabCut/pose-tensorflow/models/reachingJan30-trainset95shuffle1/train$ TF_CUDNN_USE_AUTOTUNE=0 CUDA_VISIBLE_DEVICES=0 python3 ../../../train.py
/home/mic/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
WARNING:tensorflow:From /home/mic/.local/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Config:
{'all_joints': [[0], [1], [2], [3]],
 'all_joints_names': ['hand', 'Finger1', 'Finger2', 'Joystick'],
 'batch_size': 1,
 'crop': False,
 'crop_pad': 0,
 'dataset': '../../UnaugmentedDataSet_reachingJan30/reaching_Mackenzie95shuffle1.mat',
 'dataset_type': 'default',
 'display_iters': 5000,
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'init_weights': '../../pretrained/resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 'location_refinement': True,
 'locref_huber_loss': True,
 'locref_loss_weight': 0.05,
 'locref_stdev': 7.2801,
 'log_dir': 'log',
 'max_input_size': 1000,
 'mean_pixel': [123.68, 116.779, 103.939],
 'mirror': False,
 'multi_step': [[0.005, 10000],
                [0.02, 430000],
                [0.002, 730000],
                [0.001, 1030000]],
 'net_type': 'resnet_50',
 'num_joints': 4,
 'optimizer': 'sgd',
 'pos_dist_thresh': 17,
 'regularize': False,
 'save_iters': 50000,
 'scale_jitter_lo': 0.5,
 'scale_jitter_up': 1.5,
 'scoremap_dir': 'test',
 'shuffle': True,
 'snapshot_prefix': './snapshot',
 'stride': 8.0,
 'use_gt_segm': False,
 'video': False,
 'video_batch': False,
 'weigh_negatives': False,
 'weigh_only_present_joints': False,
 'weigh_part_predictions': False,
 'weight_decay': 0.0001}
2018-04-12 16:28:32.944642: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-12 16:28:32.944900: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: 
name: Quadro K620 major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:01:00.0
totalMemory: 1.95GiB freeMemory: 1.33GiB
2018-04-12 16:28:32.944919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-12 16:28:33.373499: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-12 16:28:33.373536: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-04-12 16:28:33.373543: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-04-12 16:28:33.373694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1088 MB memory) -> physical GPU (device: 0, name: Quadro K620, pci bus id: 0000:01:00.0, compute capability: 5.0)
INFO:tensorflow:Restoring parameters from ../../pretrained/resnet_v1_50.ckpt
Restoring parameters from ../../pretrained/resnet_v1_50.ckpt
2018-04-12 16:28:38.363988: E tensorflow/stream_executor/cuda/cuda_dnn.cc:396] Loaded runtime CuDNN library: 7102 (compatibility version 7100) but source was compiled with 7005 (compatibility version 7000).  If using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2018-04-12 16:28:38.364664: W ./tensorflow/stream_executor/stream.h:2018] attempting to perform DNN operation using StreamExecutor without DNN support
Traceback (most recent call last):
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1327, in _do_call
    return fn(*args)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1312, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1420, in _call_tf_sessionrun
    status, run_metadata)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: cuDNN launch failure : input shape([1,3,395,536]) filter shape([7,7,3,64])
	 [[Node: resnet_v1_50/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_50/conv1/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, resnet_v1_50/conv1/weights/read)]]
	 [[Node: add/_763 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1602_add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "../../../train.py", line 140, in <module>
    train()
  File "../../../train.py", line 119, in train
    feed_dict={learning_rate: current_lr})
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1140, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    run_metadata)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: cuDNN launch failure : input shape([1,3,395,536]) filter shape([7,7,3,64])
	 [[Node: resnet_v1_50/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_50/conv1/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, resnet_v1_50/conv1/weights/read)]]
	 [[Node: add/_763 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1602_add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'resnet_v1_50/conv1/Conv2D', defined at:
  File "../../../train.py", line 140, in <module>
    train()
  File "../../../train.py", line 85, in train
    losses = pose_net(cfg).train(batch)
  File "/home/mic/DeepLabCut/pose-tensorflow/nnet/pose_net.py", line 96, in train
    heads = self.get_net(batch[Batch.inputs])
  File "/home/mic/DeepLabCut/pose-tensorflow/nnet/pose_net.py", line 85, in get_net
    net, end_points = self.extract_features(inputs)
  File "/home/mic/DeepLabCut/pose-tensorflow/nnet/pose_net.py", line 58, in extract_features
    global_pool=False, output_stride=16,is_training=False)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 274, in resnet_v1_50
    scope=scope)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 205, in resnet_v1
    net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1')
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py", line 146, in conv2d_same
    scope=scope)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1049, in convolution
    outputs = layer.apply(inputs)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 825, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 714, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/layers/convolutional.py", line 168, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 870, in __call__
    return self.conv_op(inp, filter)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 522, in __call__
    return self.call(inp, filter)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 206, in __call__
    name=self.name)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 953, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3290, in create_op
    op_def=op_def)
  File "/home/mic/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1654, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InternalError (see above for traceback): cuDNN launch failure : input shape([1,3,395,536]) filter shape([7,7,3,64])
	 [[Node: resnet_v1_50/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_50/conv1/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, resnet_v1_50/conv1/weights/read)]]
	 [[Node: add/_763 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1602_add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Any hints greatly appreciated!
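
The decisive line is the cuda_dnn.cc message above: the runtime loaded cuDNN 7102 (7.1.2) while this TensorFlow binary was compiled against 7005 (7.0.5), so the very first convolution fails to launch. Installing the cuDNN release that matches the TF build should resolve it. A hedged, Linux-only sketch for confirming which cuDNN runtime actually gets loaded (the library name is an assumption and may differ, e.g. libcudnn.so.7):

import ctypes

# cudnnGetVersion() returns e.g. 7102 for cuDNN 7.1.2, matching the log above;
# install the release TF was built against (7005 here) to fix the mismatch.
libcudnn = ctypes.CDLL("libcudnn.so")
print(libcudnn.cudnnGetVersion())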

Video file path not found in Analysis_tools/MakingLabeledVideo.py

Describe the problem
I think there is a bug in MakingLabeledVideo.py and MakingLabeledVideo_fast.py: they don't generate the video for me, due to what I think is a path syntax error.

Traceback
C:\Users\ua12\Dropbox (Duke Bio_Ea)\Bhandawat Lab\Umar\Week of 8-13\DeepLabCut\Analysis-tools>py MakingLabeledVideo.py
Starting ../videos/ ['4.avi']
Loading 4.avi and data.
The video was not analyzed with this scorer: DeepCut_resnet50_leg Wing TrackingAug15shuffle1_500
Other scorers were found, however: ['4DeepCut_resnet50_leg Wing TrackingAug15shuffle1_50000.h5']
Creating labeled video for: 4DeepCut_resnet50_leg Wing TrackingAug15shuffle1_50000.h5 instead.
Duration of video [s]: 32.23 , recorded with 30.0 fps!
Overall # of frames: 967 with cropped frame dimensions: [1024, 736]
Generating frames
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 967/967 [04:03<00:00, 3.97it/s]
Generating video
Traceback (most recent call last):
File "MakingLabeledVideo.py", line 158, in
Dataframe = pd.read_hdf(dataname)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 347, in read_hdf
'File %s does not exist' % path_or_buf)
FileNotFoundError: File 4DeepCut_resnet50_leg Wing TrackingAug15shuffle1_500.h5 does not exist

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "MakingLabeledVideo.py", line 173, in
CreateVideo(clip,Dataframe)
File "MakingLabeledVideo.py", line 131, in CreateVideo
str(clip.fps), '-i', 'file%04d.png', '-r', '30', '../'+vname + '_DeepLabCutlabeled.mp4'])
File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 267, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 709, in init
restore_signals, start_new_session)
File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 997, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

Additional Comments
Any help in debugging this would be appreciated. I managed to train a model successfully but am having trouble generating the videos for analysis.

viewing iterations during training

I am able to successfully prepare the dataset (both the given example and a new one), but when I train the model with train.py, using a properly set up tensorflow-gpu and Keras, training runs very slowly.

Output of training (on sample dataset):

$ TF_CUDNN_USE_AUTOTUNE=0 CUDA_VISIBLE_DEVICES=0 python ../../../train.py
WARNING:tensorflow:From C:\Users\MINDRE~1\Envs\paper1\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Config:
{'all_joints': [[0], [1], [2], [3]],
 'all_joints_names': ['hand', 'Finger1', 'Finger2', 'Joystick'],
 'batch_size': 1,
 'crop': False,
 'crop_pad': 0,
 'dataset': '../../UnaugmentedDataSet_reachingJan30/reaching_Mackenzie95shuffle1.mat',
 'dataset_type': 'default',
 'display_iters': 5000,
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'init_weights': '../../pretrained/resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 'location_refinement': True,
 'locref_huber_loss': True,
 'locref_loss_weight': 0.05,
 'locref_stdev': 7.2801,
 'log_dir': 'log',
 'max_input_size': 1000,
 'mean_pixel': [123.68, 116.779, 103.939],
 'mirror': False,
 'multi_step': [[0.005, 10000],
                [0.02, 430000],
                [0.002, 730000],
                [0.001, 1030000]],
 'net_type': 'resnet_50',
 'num_joints': 4,
 'optimizer': 'sgd',
 'pos_dist_thresh': 17,
 'regularize': False,
 'save_iters': 50000,
 'scale_jitter_lo': 0.5,
 'scale_jitter_up': 1.5,
 'scoremap_dir': 'test',
 'shuffle': True,
 'snapshot_prefix': './snapshot',
 'stride': 8.0,
 'use_gt_segm': False,
 'video': False,
 'video_batch': False,
 'weigh_negatives': False,
 'weigh_only_present_joints': False,
 'weigh_part_predictions': False,
 'weight_decay': 0.0001}
2018-04-18 18:29:11.947470: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-04-18 18:29:12.496256: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1344] Found device 0 with properties:
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:08:00.0
totalMemory: 12.00GiB freeMemory: 11.50GiB
2018-04-18 18:29:12.496584: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-18 18:29:13.123674: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-18 18:29:13.124026: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:917]      0
2018-04-18 18:29:13.124220: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:930] 0:   N
2018-04-18 18:29:13.124542: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11143 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:08:00.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from ../../pretrained/resnet_v1_50.ckpt
Restoring parameters from ../../pretrained/resnet_v1_50.ckpt
iteration: 0 loss: 0.0002 lr: 0.005

The process gets stuck on iteration 0, and there is little Python CPU activity and no GPU activity. When I modify train.py to print every iteration, it clearly iterates but the losses and learning rates do not change. As far as I know, my computer is recognizing the GPU.

If this issue should instead be directed to the people who made pose-tensorflow, I will report it to them as well.
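
One detail from the config dump is easy to miss: display_iters is 5000, so after the iteration-0 line the trainer stays silent for 5000 iterations even while it is working, which can look like a hang. A hedged sketch for lowering the logging cadence, assuming (as the dump suggests) that the train folder's pose_cfg.yaml drives these intervals:

import yaml

# Print a loss line every 100 iterations instead of every 5000.
with open("pose_cfg.yaml") as f:
    cfg = yaml.safe_load(f)
cfg["display_iters"] = 100
with open("pose_cfg.yaml", "w") as f:
    yaml.safe_dump(cfg, f)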

No code to take FIJI output to bodypart files

I could be missing something, but I don't see code to split the single .csv output by FIJI into the body-part files that the Step 2 script looks for. In case it's needed:

import os
import sys
import pandas as pd
import tkinter
from tkinter import filedialog

sys.path.append(os.getcwd().split('Generating_a_Training_Set')[0])
from myconfig import Task, bodyparts

# ask the user for the location of the single .csv output from FIJI
root = tkinter.Tk()
filename = filedialog.askopenfilename(parent=root, initialdir="/",
                                      title='Please select a csv file')

# get the base folder and the name of the data folder, assuming there is only 1 video for now
basefolder = 'data-' + Task + '/'
folder = [name for name in os.listdir(basefolder)
          if os.path.isdir(os.path.join(basefolder, name))][0]

# load the csv, take the i-th detection within each frame grouping, and save one file per body part
dframe = pd.read_csv(filename)
frame_grouped = dframe.groupby('Slice')
for i, part in enumerate(bodyparts):
    part_df = frame_grouped.nth(i)
    part_fn = basefolder + folder + '/{}.csv'.format(part)
    part_df.to_csv(part_fn)

What loss is small enough?

Hello,

The training has been running for 24h or so, and I'm wondering what loss is small enough. I'm now at

iteration: 944000 loss: 0.0005 lr: 0.001

Can I just abort it now (Ctrl+C) and take the weights as they are, or do I need to wait until the end of the training?

Cheers

FileNotFoundError: File *.h5 does not exist (MakingLabeledVideo.py fails if ffmpeg not installed)

Describe the problem
I am in the process of analyzing my videos but have run into a problem. I don't know what kind of issue this is, as this isn't my area of expertise.

To Reproduce
Steps to reproduce the behavior:

  1. Configure myconfig_analysis.py (videotype='.mp4', scorer, task and date)
  2. On elevated Anaconda Prompt, activate project env
  3. set CUDA_VISIBLE_DEVICES=0
  4. cd DeepLabCut/Analysis-tools
  5. python AnalyzeVideos.py
  6. python MakingLabeledVideo.py
  7. Error: FileNotFoundError: File SocialNov1210_middleDeepCut_resnet50_socialSep02shuffle1_500.h5 does not exist
    also:
    FileNotFoundError: [WinError 2] The system cannot find the file specified


Additional context
The video files are cropped. They were re-encoded from .avi to .mp4 with HEVC (H.265).

PC spec:
Windows 10 Pro
Ryzen 1700
gtx 1080 ti
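
As the amended title indicates, the confusing .h5 message is downstream of a missing ffmpeg: MakingLabeledVideo.py shells out to an ffmpeg executable to assemble the labeled video, and [WinError 2] is Windows reporting that this executable was not found on PATH. A quick hedged check:

import shutil

# None here means ffmpeg is not on PATH (the cause of [WinError 2] above);
# installing ffmpeg and reopening the prompt should resolve it.
print(shutil.which("ffmpeg"))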

all points are occluded in one frame (relates to old version of DLC)

Hi there,

Assume all four points that I want to track are occluded in a frame. Then typing

~/DeepLabCut-master/pose-tensorflow/models/aliAugust10-trainset95shuffle1/train$ TF_CUDNN_USE_AUTOTUNE=0 CUDA_VISIBLE_DEVICES=0 python ../../../train.py

results in this error:

 Traceback (most recent call last):
  File "../../../train.py", line 140, in <module>
    train()
  File "../../../train.py", line 80, in train
    dataset = create_dataset(cfg)
  File "/home/mic/DeepLabCut-master/pose-tensorflow/dataset/factory.py", line 16, in create
    data = PoseDataset(cfg)
  File "/home/mic/DeepLabCut-master/pose-tensorflow/dataset/pose_dataset.py", line 50, in __init__
    self.data = self.load_dataset()
  File "/home/mic/DeepLabCut-master/pose-tensorflow/dataset/pose_dataset.py", line 78, in load_dataset
    joint_id = joints[:, 0]
IndexError: index 0 is out of bounds for axis 1 with size 0

So the code needs a change to handle an empty joint list.
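
For illustration, a hypothetical guard for pose_dataset.py might look like the sketch below (dropping fully-occluded frames at dataset-generation time would work just as well):

import numpy as np

def visible_joint_ids(joints):
    # Return the joint ids for one frame, or an empty array when every
    # labeled point is occluded, so joints[:, 0] is never taken on an
    # empty array as in the traceback above.
    joints = np.asarray(joints)
    if joints.size == 0:
        return np.array([], dtype=int)
    return joints[:, 0]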
